Chapter 1

Introduction  and  Summary 


1.1  Summary 

The  purpose  of  this  study  was  to  develop  analytical  and  computational 
techniques  for  performance  evaluation  of  Autonomously  Guided  Platforms 
with  Multiple  Sensors  (AGPMS). 

The fundamental principles used are: modeling of the many feedback
sensors, modeling of the sensor data, advanced estimation and detection
techniques, sensor scheduling, regulator theory and design, stochastic
control techniques, and careful analysis of multiple time scales.

When multiple sensors are present, such as radar, various types of IR
sensors and others, one has to consider carefully the “fusion” of the data
from the various sensors in a dynamically changing environment. These
problems are essential to the success of the overall design and have not been
investigated systematically before with dynamic signal models.

Design  of  tracking  control  loops  for  each  sensor  class  is  a  stochastic 
control  problem  (not  just  a  nonlinear  filtering  problem).  When  all  loops 
are  treated  simultaneously,  simplifications  in  the  analysis  and  the  resulting 
implementation  occur  when  one  exploits  the  different  time  scales  present  in 
the  various  feedback  loops. 

In addition, AGPMS must have an adaptive control-decision capability,
because the sensors employed have diverse performance characteristics.
This fact necessitates a careful analysis of sensor models and target
representations in those sensor models.

The techniques and models used in our analysis are fairly sophisticated
vis-a-vis the classical treatment of these problems. In the classical treatment,
one ignores the combined performance index for the missile guidance and
tracking loops, which is the miss distance at interception, and instead
considers separately several subproblems:

(a)  selection  of  guidance  loop  configuration, 

(b)  setting  loop  gains  for  steady  state  accuracy  requirements, 

(c) stabilization for acceptable “gain and phase margins”, and

(d)  study  effects  of  noise  and  parametric  uncertainties. 

One  iterates  through  this  sequence  of  subproblems  in  the  order  described 
until  a  satisfactory  design  is  achieved.  This  approach  has  many  deficiencies. 
In  this  research,  we  exploit  stochastic  control  and  estimation  to  study  several 
interrelated  problems. 

In Chapter 2, we consider the design of pointing and tracking
servomechanisms for a seeker using an imaging FLIR with a gimbaled platform
from a more or less conventional perspective. We specifically consider the
application of classical, single-input single-output servo theory and the
extended Kalman filter techniques. Our intent is to establish a basis for
meaningful comparison of the performance improvement achieved with the
nonlinear stochastic control theory which is the main subject of this research
project. Performance objectives for these systems are stated primarily in
classical terms, and it is essential to fully appreciate their intent and their
implications in order to formulate well posed stochastic control problems
which are meaningful in the context of this application.

In  Chapter  3,  we  summarize  our  research  in  stochastic  control  theory 
relevant  to  tracking  and  missile  guidance  problems.  Two  classes  of  problems 
are  addressed:  (i)  optimal  stochastic  control  of  nonlinear  systems  with  “fast” 
and  “slow”  states;  and  (ii)  stochastic  scheduling  and  stability  of  systems 
(linear  and  nonlinear)  with  Poisson  noise  disturbances  (in  the  coefficients). 

The work on (i) has led to a rather complete theory for singularly
perturbed  optimal  stochastic  control  problems.  The  theory  encompasses 
several  classes  of  models,  including  systems  with  states  taking  values  in 
bounded  sets  (e.g.,  angular  variables)  and  systems  with  unbounded  states. 
Stability  criteria  for  the  “fast”  states  play  a  key  role  in  the  second  class 
of  systems.  Our  main  focus  is  on  the  existence  and  nature  of  “composite” 
control  laws  for  the  fast  and  slow  subsystems  like  those  defined  by  Chow  and 
Kokotovic  for  singularly  perturbed  deterministic  control  problems.  One  of 
the  most  important  findings  of  this  research  is  that  composite  control  laws 
for  singularly  perturbed  stochastic  control  problems  generally  do  not  exist 


in  the  simple  form  suggested  by  the  deterministic  case.  In  fact,  the  limiting 
optimal  control  law  for  the  slow  subsystem  retains  a  dependence  on  the 
states  of  the  fast  subsystem. 

Stochastic  control  problems  with  fast  and  slow  states  are  common  in  the 
design  and  evaluation  of  tracking  loops  and  missile  guidance  systems.  They 
occur  whenever  it  is  necessary  to  retain  the  interdependence  of  subsystems 
operating on different time scales (e.g., sampling rates) such as the
interaction of sensor tracking loops and guidance control loops in autonomously
guided  missiles. 

The  second  class  of  problems  treated  in  this  chapter  concerns  stochastic 
dynamical  systems  with  Poisson  noise  disturbances.  These  systems  arise  as 
models  of  physical  processes  with  intermittent  noise  disturbances.  We  have 
obtained  results  on  the  control,  scheduling,  and  stability  of  such  systems. 
The  control  results  are  not  discussed  here.  The  results  on  scheduling  are 
primarily  concerned  with  the  derivation  of  optimality  conditions  and  the 
verification  that  these  conditions  are  well  posed. 

We  also  consider  the  asymptotic  stability  of  linear  systems  with  Poisson 
noise  coefficients.  Criteria  for  stability  of  the  moments  of  such  systems  have 
been  available  for  some  time.  As  is  the  case  with  diffusion  processes,  criteria 
for  almost  sure  stability  of  the  sample  paths  are  much  more  delicate.  In  the 
present case, a key result is a deep theorem of Furstenberg on the (ergodic)
limit  properties  of  products  of  random  matrices.  This  result  allows  us  to 
develop  an  exact  expression  for  the  asymptotic,  exponential  growth  (decay) 
rate  of  the  paths  in  terms  of  an  ergodic  measure.  We  give  several  examples 
to  illustrate  the  nature  of  the  computations  and  criteria.  We  also  give  tight 
estimates  on  the  probability  of  a  large  deviation  in  a  stable  process;  and 
we  give  a  condition  for  stabilization  of  linear  systems  with  state  and  control 
dependent  Poisson  noises. 

In  Chapter  4,  we  consider  the  problem  of  simultaneous  detection  and 
estimation  when  the  signals  corresponding  to  the  M  different  hypotheses 
can  be  modelled  as  outputs  of  M  distinct  stochastic  dynamical  systems  of 
the  Ito  type.  Under  very  mild  assumptions  on  the  models  and  on  the  cost 
structure, we show that there exists a set of sufficient statistics for the
simultaneous detection-estimation problem that can be computed recursively by
linear equations. Furthermore, we show that the structure of the detector and
estimator is completely determined by the cost structure. The methodology
used  employs  recent  advances  in  nonlinear  filtering  and  stochastic  control  of 
partially  observed  stochastic  systems  of  the  Ito  type.  Specific  examples  and 
applications  in  radar  tracking  and  discrimination  problems  are  discussed. 
Chapter 2

Seeker  Pointing  and 
Tracking:  Some  Classical 
Considerations 


In this chapter, we consider the design of pointing and tracking
servomechanisms for a seeker using an imaging FLIR with a gimbaled platform
from a more or less conventional perspective. We will specifically consider the
application of classical, single input single output servo theory and the
extended Kalman filter. Our intent is to establish a basis for meaningful
comparison of the performance improvement achieved with the nonlinear
stochastic control theory which is the main subject of this research project.
Performance objectives for these systems are stated primarily in classical
terms, and it is essential to fully appreciate their intent and their
implications in order to formulate well posed stochastic control problems
which are meaningful in the context of this application.

In  the  following  paragraphs,  we  first  discuss  classical  design  methods  and 
then  control  design  based  on  the  extended  Kalman  filter. 


2.1  Classical  Servomechanism  Design 

In the classical SISO approach, the seeker boresight angles, elevation
$\theta_s$ and azimuth $\psi_s$, are treated as independent control loops. We
consider only the elevation angle ($\theta_s$) loop. Figure 1 illustrates the
general configuration of a servo-tracker in which it is desired that the
boresight elevation angle track the target line of sight elevation angle,
$\theta_t$. The tracking error is defined as

$$e = \theta_t - \theta_s \qquad (2.1)$$


The  general  control  system  objectives  are  twofold:  (a)  loop  stability,  and 
(b)  error  regulation.  Loop  stability  requires,  of  course,  that  the  closed  loop 
system  eigenvalues  lie  in  an  acceptable  region  of  the  open  left  half  plane, 
and  it  is  also  typically  required  that  specified  stability  margins  (usually 
gain  and  phase  margins)  obtain.  Error  regulation  usually  refers  to  one  or  a 
combination  of  the  following  types  of  error  specifications: 

1. Provide acceptable ultimate state error coefficients for prescribed
deterministic target trajectories. A common example would be the
requirement that $e(t) \to 0$ as $t \to \infty$ when $\theta_t(t)$ is a step
or a ramp function. It is also common to add other time response shape
requirements, e.g., rise time and overshoot specifications.

2. With $\theta_t$ specified as a zero mean random signal with prescribed
power density spectrum, provide an acceptable error power density spectrum,
which is frequently specified as an upper bound over a given frequency band.

For example, a typical FLIR performance specification defines normal
dynamic inputs to be those with line of sight rates less than 0.5 rad/sec and
angular accelerations less than 0.5 rad/sec² (see Interface Control Document
5801647A, 30 September 1983). It further requires that the line of sight
angular  deviations  remain  within  the  bounds  indicated  in  Figure  2.  We 
will  consider  the  design  of  a  servomechanism  to  meet  this  deterministic 
performance  objective  and  then  examine  the  implications  of  restating  the 
design  objectives  in  terms  of  a  stochastic  control  problem. 

Figure 3 illustrates a choice of inner loop and series compensation which
allows the stated objectives to be achieved. Various choices of the
parameters satisfy the tracking requirement, and the final selection would be
made by analysis of the tradeoff between tracking performance and stability
margins. Note that the performance specification as stated requires that the
control loop be at least a type 1 servomechanism. This guarantees zero
ultimate state error following step input signals and bounded ultimate state
error following ramp input signals. The ramp input error bound is controlled
by the lead/lag ratio. Increasing the type number of the loop or increasing
the lead/lag ratio will improve the ultimate state error response but
substantially reduce stability margins.
[Figure 3: inner loop and series compensation; block label: inner loop compensator]


Suppose now that we consider the following stochastic version of the
above design problem. The target line of sight elevation angle is modeled
by the stochastic differential equation

$$\frac{d^2}{dt^2}\theta_t = v \qquad (2.2)$$

where $v$ is a zero mean Gaussian white noise. The motivation for such a
model is provided in the report [2]. It is easy to show that

$$E\{e^2\} = \int G_{ee}(\omega)\,d\omega < \infty \qquad (2.3)$$

only if the control loop is at least a type 2 servomechanism. This is an
obvious consequence of the fact that the target model is not asymptotically
stable. It has important implications, however, with respect to the
formulation of well-posed stochastic control problems for this class of models.
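
The type 2 requirement can be checked numerically. The following is a minimal sketch, assuming the target angle is white noise passed through a double integrator (so its spectrum behaves like $1/\omega^4$) and using two hypothetical open-loop transfer functions, `type1` and `type2`, with a hypothetical gain `K`; none of these are the compensators of Figure 3. It evaluates the error power density near $\omega = 0$, where integrability is decided.

```python
def error_psd(omega, loop_gain):
    """Error power density |S(jw)|^2 / w^4 for a double-integrator target model.

    loop_gain is a callable s -> L(s), the open-loop transfer function;
    S = 1 / (1 + L) is the sensitivity (target angle to tracking error).
    The 1/w^4 factor is the spectrum of unit white noise passed through 1/s^2.
    """
    s = 1j * omega
    sensitivity = 1.0 / (1.0 + loop_gain(s))
    return abs(sensitivity) ** 2 / omega ** 4


K = 10.0  # hypothetical loop gain, for illustration only


def type1(s):
    """One integrator in the loop (hypothetical example)."""
    return K / (s * (s + 1.0))


def type2(s):
    """Two integrators in the loop, with a lead zero for stability (hypothetical)."""
    return K * (s + 1.0) / s ** 2


for w in (1e-1, 1e-2, 1e-3):
    print(w, error_psd(w, type1), error_psd(w, type2))
```

For the type 1 loop the density grows like $1/(K^2\omega^2)$ as $\omega \to 0$, which is not integrable at the origin; for the type 2 loop it tends to the finite value $1/K^2$.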


2.2  Control  Design  Based  on  the  Extended 
Kalman  Filter 

In  this  section,  we  consider  the  application  of  the  extended  Kalman  filter 
(EKF)  to  seeker  servomechanism  design.  The  general  configuration  of  the 
control  system  is  illustrated  in  Figure  4.  The  configuration  shown  is  based  on 
an  extension  of  linear  disturbance  accommodating/tracking  servomechanism 
theory  (see  Kwatny  and  Kalnitsky  [3]  and  the  references  therein).  The  EKF 
provides  continuous,  on-line  estimates  of  a  linear  target/platform  model  in 
relative  coordinates,  given  observations  involving  nonlinear  transformations 
in  the  presence  of  additive  measurement  noise.  These  estimates  are  then 
used  by  a  robust  disturbance  accommodating  servomechanism,  where  the 
controller is optimal for the case of full state observations. In the following
paragraphs, we define the model, describe the design of the EKF, and
describe the computation of the feedforward matrix functions $U(w)$ and $X(w)$.

2.2.1  The  Model 

The  model  details  depend,  of  course,  on  the  specific  configuration  of  the 
seeker.  We  consider  a  simple,  reasonably  generic  situation.  The  FLIR  is 
mounted via two sets of gimbals on an inertial base and is therefore free to
rotate about two axes through a fixed point $O$ in inertial space. We define
the following three coordinate systems, all with origin at $O$:



1. the inertial frame with coordinates $X$, $Y$, $Z$

2. the target LOS frame with coordinates $x$, $y$, $z$

3. the boresight LOS frame with coordinates $x'$, $y'$, $z'$

The relative position of any two reference frames can be defined in terms
of the conventional elevation angle $\theta$ and azimuth angle $\psi$. We will
use the following notation:

1. $\theta_s, \psi_s$ - boresight LOS angles, relative angles between the
boresight LOS frame and the inertial frame.

2. $\theta_t, \psi_t$ - target LOS angles, relative angles between the target
LOS frame and the inertial frame.

3. $\Delta\theta_t, \Delta\psi_t$ - boresight/target deviation, relative angles
between the target LOS frame and the boresight LOS frame.

For a system without a rotor and assuming that the inertias about the
$x'$ axis and $z'$ axis are the same, the equations of motion for the boresight
angles take the form

$$J_\psi \frac{d^2}{dt^2}\psi_s = \tau_\psi, \qquad
J_\theta \frac{d^2}{dt^2}\theta_s = \tau_\theta \qquad (2.4)$$

We assume that the torque $\tau$ is related to the corresponding control
input $u$ by the linear relation

$$\tau_\alpha = K'_\alpha u_\alpha, \qquad \alpha = \psi, \theta \qquad (2.5)$$


The  Target  Kinematic  Model 

The target kinematics in inertial space are defined by

$$\begin{aligned}
da_T(t) &= -\Lambda a_T(t)\,dt + \Sigma\,dw(t) \\
\dot V_T(t) &= a_T(t) \\
\dot P_T(t) &= V_T(t)
\end{aligned} \qquad (2.6)$$

where the three-vectors are $a_T$ = target acceleration, $V_T$ = target
velocity, $P_T$ = target position, and $w(t)$ is a three-vector valued Gaussian
process with independent components with mean zero and

$$E\{w(t_1)w^T(t_2)\} = \begin{cases} I_3, & \text{if } t_1 = t_2 \\ 0_3, & \text{else} \end{cases}$$


The  Platform  Kinematic  Model 

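A platform vibration model is included for resonant vibration
characteristics of the airframe

$$\begin{aligned}
ds(t) &= A_m s(t)\,dt + B_m\,dv(t) \\
a_m(t) &= T_m s(t) \\
\dot V_m(t) &= a_m(t) \\
\dot P_m(t) &= V_m(t)
\end{aligned} \qquad (2.7)$$

where $a_m$ = platform acceleration, $V_m$ = platform velocity, $P_m$ = platform
position, and $s$ = a fictitious six-vector of states.

The model parameters are

$$A_m = \begin{pmatrix} A_1 & 0 & 0 \\ 0 & A_1 & 0 \\ 0 & 0 & A_1 \end{pmatrix},
\qquad
B_m = \begin{pmatrix} 0 & 0 & 0 \\ c & 0 & 0 \\ 0 & 0 & 0 \\ 0 & c & 0 \\ 0 & 0 & 0 \\ 0 & 0 & c \end{pmatrix},$$

$$A_1 = \begin{pmatrix} -a & -b \\ b & -a \end{pmatrix},
\qquad
T_m = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \end{pmatrix}$$

where $a = 2\zeta b\,(1 - \ldots)$; $b = 2\pi f$;
$c = a_{res}\,b\,2\zeta\sqrt{1 - \zeta^2}$; $f$ = resonant frequency in Hz;
$0 < \zeta < 1$ = damping ratio; $a_{res}$ = peak-to-peak vibration
acceleration in [m/sec²].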
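
The block structure of the vibration model (2.7) is easy to assemble and check numerically. The sketch below assumes NumPy; the function name `platform_model` and the numerical values of `a`, `b`, `c` are hypothetical and serve only to exercise the structure. It confirms that each 2x2 block contributes the eigenvalue pair $-a \pm jb$, i.e. a damped resonance at angular frequency $b$.

```python
import numpy as np


def platform_model(a, b, c):
    """Return (A_m, B_m, T_m) with the block structure of the vibration model."""
    A1 = np.array([[-a, -b],
                   [b, -a]])
    A_m = np.kron(np.eye(3), A1)       # block diag(A1, A1, A1), 6x6
    B_m = np.zeros((6, 3))
    T_m = np.zeros((3, 6))
    for axis in range(3):
        B_m[2 * axis + 1, axis] = c    # noise drives the second state of each pair
        T_m[axis, 2 * axis] = 1.0      # acceleration is the first state of each pair
    return A_m, B_m, T_m


# Hypothetical numbers, chosen only to exercise the code.
a, b, c = 0.5, 2.0 * np.pi * 10.0, 1.0
A_m, B_m, T_m = platform_model(a, b, c)
eigs = np.linalg.eigvals(A_m)
print(eigs.real, np.abs(eigs.imag))
```

Because `T_m @ B_m` is zero, the noise does not enter the acceleration output directly but only through the resonant dynamics.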


The  Relative  Target-Platform  Kinematic  Model 

Since  the  dimensions  of  the  state  models  for  target  and  platform  are  not 
the  same,  we  augment  the  former  with  some  trivial  states  as  follows. 

The  platform  acceleration  state  equation  can  be  written  from  Equation 
(2.7)  as 

(  af  )  =  PAmP-1  (  J  )  +  PBmV(t)  (2.9) 

where  P  is  a  6x6  permutation  matrix 


■(*i) 


We  next  write  an  equivalent  state  space  model  for  the  target  acceleration 


(2.10) 


(?)  =[-o\°]  (?)  +[o]  *» 

Let 

«<>=(?)  -(z) 

?<'>  =  {  ['oA  2]-[p  p_1]  }  l  |  -pB» 

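Thus the relative kinematics can be written in the form

$$\begin{aligned}
\dot\xi(t) &= A_{rel}\,\xi(t) + B_{rel}\,\nu \\
a_{rel}(t) &= T_{rel}\,\xi(t) = [I_3 \mid 0_3]\,\xi(t) \\
\dot V_{rel}(t) &= a_{rel}(t) \\
\dot P_{rel}(t) &= V_{rel}(t)
\end{aligned} \qquad (2.11)$$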

R?S 


Brel  = 


e  0  0  0 

Oa  0  0 

0  0  0  0 
0  0  0  -c 
0  0  0  0 
0  0  0  0 


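Finally, Equation (2.12) is written compactly as

$$\dot w(t) = Z\,w(t) + \eta(t) \qquad (2.13)$$

where $w$ is a 12-vector, $\xi(t)$ is a 6-vector, $w = (P_{rel}, V_{rel}, \xi)^T$,
$\eta(t) = W\nu(t)$, and

$$Z = \begin{pmatrix}
0_3 & I_3 & 0_{3\times 6} \\
0_3 & 0_3 & T_{rel} \\
0_{6\times 3} & 0_{6\times 3} & A_{rel}
\end{pmatrix},
\qquad
W = \begin{pmatrix} 0_3 \\ 0_3 \\ B_{rel} \end{pmatrix}$$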

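The observations: For simplicity of notation, take $w = (x, y, z, \tilde w)^T$
where $x, y, z$ are the relative position coordinates of the target with
respect to the platform in an inertial frame fixed to the platform with
$z$-axis pointing down.

The target location in the seeker boresight frame is given in terms of the
angles $\theta_s$, $\psi_s$ by

$$\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix}
= T(\psi_s, \theta_s) \begin{pmatrix} x \\ y \\ z \end{pmatrix} \qquad (2.14)$$

where the rotation matrix $T$ is given as

$$T(\psi_s, \theta_s) = \begin{pmatrix}
\cos\theta_s\cos\psi_s & \cos\theta_s\sin\psi_s & \sin\theta_s \\
-\sin\psi_s & \cos\psi_s & 0 \\
-\sin\theta_s\cos\psi_s & -\sin\theta_s\sin\psi_s & \cos\theta_s
\end{pmatrix} \qquad (2.15)$$

The coordinates of the target trackpoint in the FLIR image plane are given by

$$\begin{pmatrix} y_I \\ z_I \end{pmatrix}
= \frac{f_0}{R}\,P_2\,T(\psi_s, \theta_s) \begin{pmatrix} x \\ y \\ z \end{pmatrix} \qquad (2.16)$$

where $P_2 = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$ is a projection
onto the $y$-$z$ plane in the boresight frame, $R(t) = \sqrt{x^2 + y^2 + z^2}$ is
the range, which is available by separate laser range finder measurement, and
$f_0$ is the focal length.

Thus the observation equation is, in discrete time,

$$y(t_k) = \begin{pmatrix} R(x, y, z) \\[4pt]
\dfrac{f_0}{R}\,P_2\,T(\psi_s, \theta_s) \begin{pmatrix} x \\ y \\ z \end{pmatrix}
\end{pmatrix} + \epsilon(t_k)
= h(x, y, z) + \epsilon, \qquad k = 1, 2, \ldots \qquad (2.17)$$

where $y$ is a 3-vector and the measurement noise $\epsilon$ is a 3-vector,
Gaussian, zero mean, white noise process with $R_k = E\{\epsilon_k\epsilon_k^T\}$.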
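
The noise-free part of the observation map can be sketched in a few lines. The code below assumes NumPy and assumes the rotation matrix has the standard elevation/azimuth form with rows $(\cos\theta\cos\psi, \cos\theta\sin\psi, \sin\theta)$, $(-\sin\psi, \cos\psi, 0)$, $(-\sin\theta\cos\psi, -\sin\theta\sin\psi, \cos\theta)$; the names `rotation`, `observe` and the default focal length `f0=1.0` are hypothetical. It checks that a target lying exactly on the boresight produces zero image-plane offsets.

```python
import numpy as np


def rotation(psi, theta):
    """Rotation matrix T(psi, theta): inertial coordinates -> LOS frame coordinates."""
    cp, sp = np.cos(psi), np.sin(psi)
    ct, st = np.cos(theta), np.sin(theta)
    return np.array([[ct * cp, ct * sp, st],
                     [-sp, cp, 0.0],
                     [-st * cp, -st * sp, ct]])


# Projection that keeps the y', z' components (the image plane).
P2 = np.array([[0.0, 1.0, 0.0],
               [0.0, 0.0, 1.0]])


def observe(pos, psi_s, theta_s, f0=1.0):
    """Noise-free observation: range plus the two image-plane coordinates."""
    R = np.linalg.norm(pos)
    image = (f0 / R) * (P2 @ rotation(psi_s, theta_s) @ pos)
    return np.concatenate(([R], image))


# Target placed exactly on the boresight: image-plane offsets should vanish.
psi_s, theta_s, rng = 0.3, -0.2, 1000.0
pos = rng * np.array([np.cos(theta_s) * np.cos(psi_s),
                      np.cos(theta_s) * np.sin(psi_s),
                      np.sin(theta_s)])
print(observe(pos, psi_s, theta_s))
```

A nonzero second or third component of the output therefore measures the pointing error directly, scaled by the focal length.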

2.2.2  The  Extended  Kalman  Filter 

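To implement the EKF, we will need the 3x12 Jacobian matrix

$$H(w) = \frac{\partial h(w)}{\partial w}$$

Let $\bar w = (x, y, z)^T$ and note that $h$ depends only on $\bar w$. Define
the 3x3 matrix $\bar H(\bar w) = \partial h / \partial \bar w$; then

$$H(w) = [\bar H(\bar w), 0_{3\times 9}]. \qquad (2.18)$$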

We implement the EKF in continuous time but with the observations $y(t_k)$
available at discrete times only (cf. A. Gelb, Applied Optimal Estimation,
pg. 188):

$$\begin{aligned}
\dot{\hat w}(t) &= Z\,\hat w(t) \\
\dot P(t) &= Z P(t) + P(t) Z^T + Q
\end{aligned} \qquad (2.19)$$

Now integrate over $t_k < t < t_{k+1}$, with initial conditions given at $t_k$
by the update equations:

$$\hat w(t_k^+) = \hat w(t_k^-) + K_k\,[\,y(t_k) - h_k(\hat w(t_k^-))\,] \qquad (2.20)$$

Define $H_k = H(w)|_{w = \hat w(t_k^-)}$; then

$$P(t_k^+) = [I - K_k H_k]\,P(t_k^-) \qquad (2.21)$$

$$K_k = P(t_k^-) H_k^T\,[H_k P(t_k^-) H_k^T + R_k]^{-1} \qquad (2.22)$$

and the matrix $Q$ is defined by

$$Q = E\{\eta(t)\eta^T(t)\}$$

where $\eta(t) = W\nu(t)$.

Remarks: 

1. Equation (2.19a) is a 12-dimensional linear differential equation with
the same parameters, $Z$, as in Equation (2.13). It is the “on-line”
model.

2. Equation (2.19b) is a matrix-valued differential Riccati equation with
symmetric solution $P(t)$, which must be propagated from $t_k$ to $t_{k+1}$.

3. Equation (2.20) is the update equation of the on-line model. It
contains the “true” nonlinearity $h(\cdot)$ as it appears in Equation (2.17)
except that the most current estimate of the range $R(t_k)$ is used (instead
of, e.g., $\sqrt{x^2 + y^2 + z^2}$).

4. Equation (2.21) updates the Riccati matrix.

5. Equation (2.22) updates the optimal gain $K_k$ for the current update
evaluation.
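
The propagate/update cycle described in the remarks can be sketched as follows. This is a minimal illustration assuming NumPy, a simple Euler integration of the propagation equations between samples, and the conventional choice of evaluating the Jacobian at the pre-update estimate; the two-state model, the measurement function `h`, and all numerical values are hypothetical stand-ins for the 12-state seeker model.

```python
import numpy as np


def propagate(w_hat, P, Z, Q, dt, n_steps=100):
    """Euler-integrate the state and covariance equations over one interval dt."""
    step = dt / n_steps
    for _ in range(n_steps):
        w_hat = w_hat + step * (Z @ w_hat)
        P = P + step * (Z @ P + P @ Z.T + Q)
    return w_hat, P


def update(w_minus, P_minus, y, h, jac_h, R):
    """Measurement update: gain, state correction, covariance correction."""
    H = jac_h(w_minus)                      # Jacobian at the pre-update estimate
    S = H @ P_minus @ H.T + R               # innovation covariance
    K = P_minus @ H.T @ np.linalg.inv(S)
    w_plus = w_minus + K @ (y - h(w_minus))
    P_plus = (np.eye(len(w_minus)) - K @ H) @ P_minus
    return w_plus, P_plus, K


# Toy two-state stand-in (position, velocity) for the 12-state model.
Z = np.array([[0.0, 1.0],
              [0.0, 0.0]])
Q = np.zeros((2, 2))
R = np.array([[0.01]])


def h(w):
    """Hypothetical nonlinear range-like measurement."""
    return np.array([np.sqrt(1.0 + w[0] ** 2)])


def jac_h(w):
    """Jacobian of h with respect to the state."""
    return np.array([[w[0] / np.sqrt(1.0 + w[0] ** 2), 0.0]])


w_hat, P = np.array([1.0, 0.5]), np.eye(2)
w_hat, P = propagate(w_hat, P, Z, Q, dt=1.0)
w_plus, P_plus, K = update(w_hat, P, np.array([2.0]), h, jac_h, R)
print(w_plus, np.trace(P), np.trace(P_plus))
```

In a real implementation one would likely replace the Euler loop with a matrix exponential or a higher-order integrator, since the on-line model is linear with constant coefficients.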


2.2.3  Computation  of  the  Feedforward  Matrices 

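Let $x$ represent the boresight state, i.e.,

$$x = \left(\theta_s,\ \psi_s,\ \tfrac{d}{dt}\theta_s,\ \tfrac{d}{dt}\psi_s\right)^T$$

and $w$ the target state. We seek a control input $u(t)$ and corresponding state
trajectory $x(t)$ so that perfect tracking occurs. That is

$$\Delta\theta_t(t) \equiv 0 \qquad (2.23)$$

$$\Delta\psi_t(t) \equiv 0 \qquad (2.24)$$

Moreover, we seek $u$, $x$ in the form

$$u = U(w), \qquad x = X(w) \qquad (2.25)$$

$X(w)$: Exact tracking requires that

$$\left(\theta_s,\ \psi_s,\ \tfrac{d}{dt}\theta_s,\ \tfrac{d}{dt}\psi_s\right)
= \left(\theta_t,\ \psi_t,\ \tfrac{d}{dt}\theta_t,\ \tfrac{d}{dt}\psi_t\right) \qquad (2.26)$$

Recall the transformation from rectangular to polar coordinates
$(X, Y, Z) \leftrightarrow (R, \theta_t, \psi_t)$:

$$\begin{aligned}
X &= R\cos\theta_t\cos\psi_t \\
Y &= R\cos\theta_t\sin\psi_t \\
Z &= R\sin\theta_t
\end{aligned} \qquad (2.27)$$

$$\begin{aligned}
R &= (X^2 + Y^2 + Z^2)^{1/2} \\
\theta_t &= \sin^{-1}(Z/R) \\
\psi_t &= \tan^{-1}(Y/X)
\end{aligned} \qquad (2.28)$$

Note that Equations (2.28b) and (2.28c) immediately provide $\theta_t$ and
$\psi_t$ as functions of $w$. We still need $\frac{d}{dt}\theta_t(w)$ and
$\frac{d}{dt}\psi_t(w)$. To obtain these, let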

$V_R^t$ = target inertial velocity in target LOS coordinates
$V_R$ = target inertial velocity in inertial frame coordinates
$\omega_t$ = target LOS frame angular velocity in target LOS coordinates

Then we have


wt  =  {j^tsinBt,  £et,  j^tcosOt)1 

Vr  =  T(-et,9,)Vjt 

PiWt  =  PiVf 


These  lead  to 

4e‘ 


=  G(©(,*t)V*  = 


-ain'f. 


COS'ft 


—tanQtCosVt  —tanQtsin'it  1 


(2.29) 

(2.30) 

(2.31) 


Vk 

(2.32) 


which  provide  the  required  relations. 
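
The dependence of the LOS angular rates on position and velocity can be illustrated by differentiating the polar-coordinate relations directly. The sketch below is not the source's matrix $G$; it is an independent derivation under the assumption that $\theta_t = \sin^{-1}(Z/R)$ and $\psi_t = \tan^{-1}(Y/X)$, with `atan2` used (an assumption) to resolve the quadrant. The function names and test values are hypothetical.

```python
import math


def los_angles(X, Y, Z):
    """(R, theta_t, psi_t) from rectangular coordinates."""
    R = math.sqrt(X * X + Y * Y + Z * Z)
    return R, math.asin(Z / R), math.atan2(Y, X)


def los_rates(pos, vel):
    """(d theta_t/dt, d psi_t/dt) obtained by the chain rule along the motion."""
    X, Y, Z = pos
    Vx, Vy, Vz = vel
    R, theta, psi = los_angles(X, Y, Z)
    theta_dot = (-math.sin(theta) * math.cos(psi) * Vx
                 - math.sin(theta) * math.sin(psi) * Vy
                 + math.cos(theta) * Vz) / R
    psi_dot = (-math.sin(psi) * Vx + math.cos(psi) * Vy) / (R * math.cos(theta))
    return theta_dot, psi_dot


# Hypothetical relative position [m] and velocity [m/sec].
pos, vel = (100.0, 50.0, -20.0), (3.0, -4.0, 1.0)
print(los_rates(pos, vel))
```

The rates are linear in the velocity for fixed angles and range, and the azimuth rate is singular at $\theta_t = \pm\pi/2$, which is the usual gimbal singularity of this parameterization.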

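$U(w)$: Exact tracking requires
$\left(\frac{d^2}{dt^2}\theta_s, \frac{d^2}{dt^2}\psi_s\right)
= \left(\frac{d^2}{dt^2}\theta_t, \frac{d^2}{dt^2}\psi_t\right)$. Using the
equations of motion, we can write

$$\mathrm{diag}(J_\theta, J_\psi)\,\frac{d^2}{dt^2}(\theta_s, \psi_s)^T
= \mathrm{diag}(K'_\theta, K'_\psi)\,u \qquad (2.33)$$

But from Equation (2.32), we have

$$\frac{d^2}{dt^2}(\theta_t, \psi_t)^T
= \left[(\partial G/\partial\theta_t)V_R \mid (\partial G/\partial\psi_t)V_R\right] G V_R + G a_R \qquad (2.34)$$

where $a_R$ is the target acceleration. Thus, we obtain from (2.33) and (2.34)

$$u = U(w) = \mathrm{diag}(1/K'_\theta, 1/K'_\psi)\,\mathrm{diag}(J_\theta, J_\psi)\,
\left\{\left[(\partial G/\partial\theta_t)V_R \mid (\partial G/\partial\psi_t)V_R\right] G V_R + G a_R\right\} \qquad (2.35)$$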
Chapter 3


Stochastic  Control  of 
Dynamical  Systems 

Summary: 

In this chapter, we summarize our research in stochastic control theory
relevant to tracking and missile guidance problems. Two classes of problems
are addressed: (i) optimal stochastic control of nonlinear systems with “fast”
and “slow” states; and (ii) stochastic scheduling and stability of systems
(linear and nonlinear) with Poisson noise disturbances (in the coefficients).

The work on (i) has led to a rather complete theory for singularly
perturbed optimal stochastic control problems. The theory encompasses several
classes of models, including systems with states taking values in bounded
sets (e.g., angular variables) and systems with unbounded states. Stability
criteria for the “fast” states play a key role in the second class of systems.
The theory includes both absorbing (Dirichlet) and reflecting (Neumann)
boundary conditions for systems with bounded state spaces. Its main focus
is on the existence and nature of “composite” control laws for the fast
and slow subsystems like those defined by Chow and Kokotovic for singularly
perturbed deterministic control problems. One of the most important
findings of this research is that composite control laws for singularly
perturbed stochastic control problems generally do not exist in the simple form
suggested by the deterministic case.

In general, one cannot design an effective feedback control for the overall
system (fast and slow states) based on optimization of the natural limiting
system obtained by a standard asymptotic analysis of the model. That
is, one cannot generally “separate” the processes of asymptotic analysis and
optimization. In fact, the limiting optimal control law for the slow subsystem
retains a dependence on the states of the fast subsystem.

Stochastic  control  problems  with  fast  and  slow  states  are  common  in  the 
design  and  evaluation  of  tracking  loops  and  missile  guidance  systems.  They 
occur  whenever  it  is  necessary  to  retain  the  interdependence  of  subsystems 
operating on different time scales (e.g., sampling rates) such as the
interaction of sensor tracking loops and guidance control loops in autonomously
guided  missiles. 

The  second  class  of  problems  treated  in  this  chapter  concerns  stochastic 
dynamical  systems  with  Poisson  noise  disturbances.  These  systems  arise  as 
models  of  physical  processes  with  intermittent  noise  disturbances.  We  have 
obtained  results  on  the  control,  scheduling,  and  stability  of  such  systems. 
The  control  results  are  not  discussed  here.  The  results  on  scheduling  are 
primarily  concerned  with  the  derivation  of  optimality  conditions  and  the 
verification that these conditions are well-posed. We use a constructive
limiting argument developed earlier for diffusion process models to obtain the
optimal scheduling policy and cost as the limit of a sequence of optimal
scheduling problems in which a finite number of switchings are permitted.
The optimality conditions for these problems are quasi-variational
inequalities (QVI’s) introduced for scheduling and inventory control by
Bensoussan and Lions. The properties of the Poisson noise disturbances cause
the QVI’s to be “first order” and “fully nonlinear” (in contrast to the
classical case of diffusion processes). As a result, their analysis requires
methods intermediate between those used for diffusion systems (elliptic
models) and deterministic systems (first order). In particular, we use the
method of viscosity solutions introduced by Crandall and Lions to establish
uniqueness of the optimal cost when some of the switching costs are zero.

We  also  consider  the  asymptotic  stability  of  linear  systems  with  Poisson 
noise  coefficients.  Criteria  for  stability  of  the  moments  of  such  systems  have 
been available for some time (S. Marcus). As is the case with diffusion
processes, criteria for almost sure stability of the sample paths are much more
delicate. In the present case, a key result is a deep theorem of Furstenberg
on the (ergodic) limit properties of products of random matrices. This result
allows us to develop an exact expression for the asymptotic, exponential
growth (decay) rate of the paths in terms of an ergodic measure. We give
several examples to illustrate the nature of the computations and criteria.
We also give tight estimates on the probability of a large deviation in a
stable process; and we give a condition for stabilization of linear systems with
state and control dependent Poisson noises.
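
The distinction between moment stability and almost sure stability can be seen in a scalar special case, where no random-matrix theory is needed. The sketch below assumes a hypothetical system $\dot x = a x$ whose state is multiplied by $(1 + b)$ at the event times of a Poisson process of rate $\lambda$; by the strong law of large numbers the almost sure growth rate is then $a + \lambda \ln|1 + b|$. The names and parameter values are illustrative assumptions, not taken from the report.

```python
import math
import random


def sample_growth_rate(a, b, lam, T=2000.0, seed=0):
    """Estimate (1/T) log|x(T)| for dx/dt = a x with jumps x -> (1 + b) x
    occurring at the event times of a Poisson process of rate lam."""
    rng = random.Random(seed)
    log_x = 0.0                              # log|x(0)| with x(0) = 1
    t = 0.0
    while True:
        wait = rng.expovariate(lam)          # time to the next Poisson event
        if t + wait > T:
            log_x += a * (T - t)             # flow to the end of the horizon
            break
        log_x += a * wait                    # deterministic flow between events
        log_x += math.log(abs(1.0 + b))      # multiplicative jump
        t += wait
    return log_x / T


a, b, lam = 0.3, -0.5, 1.0                   # hypothetical parameters
exact = a + lam * math.log(abs(1.0 + b))     # almost sure exponential rate
print(sample_growth_rate(a, b, lam), exact)
```

With these numbers the sample paths decay almost surely (rate about -0.39), while the fourth moment grows, since its rate $4a + \lambda(|1 + b|^4 - 1)$ is positive. In the matrix case the logarithm of a scalar factor is replaced by an average against an ergodic measure, which is where the theorem on products of random matrices enters.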

In the first section we consider the problem of optimal stochastic control
of diffusion processes containing “fast” and “slow” dynamics. The systems
are considered on an unbounded state space. The analysis highlights the
key role played by ergodicity of the fast state variables. We use a stochastic
stability theorem of Khas’minskii to determine the conditions under which
ergodicity holds and the optimal control problem is well posed. The
limiting control problem obtained as the small parameter goes to zero retains
an interesting interdependence between fast and slow variables. The work
reported  in  the  first  section  of  this  chapter  is  a  summary  of  a  portion  of 
[3].  That  paper  should  be  consulted  for  details  of  the  proofs  and  for  other 
related  problems  and  results. 

In  the  second  section  of  this  chapter  we  present  a  summary  of  some 
work  on  the  optimal  stochastic  scheduling  of  systems  with  jump  process 
parameters.  The  work  described  in  that  section  is  abstracted  from  the  paper 
[24].  The  main  results  are  a  characterization  of  the  optimality  conditions  in 
terms  of  viscosity  solutions  to  a  class  of  Bellman  equations. 

In the third section of this chapter we present a summary of our research
on the stability properties of linear stochastic dynamical systems with
Poisson noise disturbances as parameters. The main results in that section are
expressions for the exponential asymptotic growth (decay) rates of the
solutions.

3.1  Stochastic  Optimal  Control  of  Systems  with 
Fast  and  Slow  States 

3.1.1  Introduction 

In  this  section  we  address  the  following  class  of  control  problems.  We  have 
a  system  governed  by 

dx  =  /( x,  y,  v)dt  +  y/ldw 

(dy  =  g(x,  y,  v)dt  +  V2edb  (3.1) 

x(0)  =  x,y(0)  =  y. 

where  w  and  b  are  independent  Wiener  processes.  The  state  x(t)  represents 
the  slow  system,  while  the  state  y(t)  represents  the  fast  system.  The  scaling 
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is  such  that  the  variations  of  the  fast  system  per  unit  of  time,  in  average  as 
well  as  in  variance,  are  of  order  1/e.  The  dynamics  are  controlled  via  the 
parameter  v(t).  There  is  full  information  and  the  objective  is  to  minimize 
the  payoff 

•£*(*(•))  =  E  (3-2) 

where  r  denotes  the  first  exit  time  of  the  process  x  from  the  boundary  T 
of  a  domain  O  usually  taken  to  be  smooth  and  bounded.  (We  will,  in  fact, 
treat  systems  on  unbounded  domains.)  Call 

«t(x,y)  =  mf{J«>(.))>, 
then  u(  is  the  solution  of  the  Bellman  equation 


with 


-  A*u‘  -  ^-Ayti*  +  flu*  =  if(i,D*u*,y,  ifJyU*) 
«€  =  0  Vx  €  r 


E(x,p,y,q)  =  inf  [/(x,y,  v)  +  p  •  /(x,  y,  v) 
"ei/w 

+9 -y(z,v,v)]=:  inf  L(x,p,y,q) 


(3.3) 


(3.4) 


We assume sufficient smoothness so that there exists a Borel map $V(x,p,y,q)$
with values in $U_{ad}$ such that

$$H(x,p,y,q) = L(x,p,y,q,V). \tag{3.5}$$

We can then define an optimal feedback control for the problem by setting

$$v_\epsilon(x,y) = V\Big(x, D_x u_\epsilon, y, \frac{1}{\epsilon} D_y u_\epsilon\Big); \tag{3.6}$$

then

$$v_\epsilon(t) = v_\epsilon(x_\epsilon(t), y_\epsilon(t)) \tag{3.7}$$

is an optimal control for (3.2).
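
To make the time-scale separation in (3.1) concrete, the following is a minimal simulation sketch; it is not part of the original analysis. The drifts `f` and `g`, the absence of a control, the initial state, and the Euler-Maruyama discretization are all illustrative assumptions. The sketch only shows how dividing the fast equation by $\epsilon$ produces a drift of order $1/\epsilon$ and a noise variance of order $1/\epsilon$ per unit time.

```python
import math
import random

def simulate_two_scale(eps, horizon=2.0, dt=1e-4, seed=0):
    """Euler-Maruyama sketch of dx = f dt + sqrt(2) dw, eps*dy = g dt + sqrt(2*eps) db.

    The drifts below are hypothetical choices made only for illustration.
    Returns the final slow state and the time average of y^2 along the path.
    """
    rng = random.Random(seed)
    f = lambda x, y: -x + y   # assumed slow drift
    g = lambda x, y: -y       # assumed fast drift (stable, so y is ergodic)
    x, y = 1.0, 0.0
    steps = int(horizon / dt)
    sq_sum = 0.0
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        db = rng.gauss(0.0, math.sqrt(dt))
        x_next = x + f(x, y) * dt + math.sqrt(2.0) * dw
        # Dividing the fast equation by eps gives drift g/eps and noise sqrt(2/eps).
        y = y + g(x, y) / eps * dt + math.sqrt(2.0 / eps) * db
        x = x_next
        sq_sum += y * y
    return x, sq_sum / steps

x_final, mean_y2 = simulate_two_scale(eps=0.01)
```

With these assumed drifts the fast variable relaxes on a time scale of order $\epsilon$ toward a stationary law of unit variance, so `mean_y2` should come out roughly equal to 1 when $\epsilon$ is small compared with the horizon.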

Such systems arise in the design and analysis of tracking loop systems,
where the fast subsystem corresponds to the dynamics of the sensor control
loop and the slow subsystem corresponds to the dynamics of the platform.
Many other applications have models which exhibit similar features.

Our objective is to study the behavior of equation (3.3) for $\epsilon$ small,
and to interpret the results as a limit control problem approximating (3.1),
(3.2). Let us explain the type of results which one can expect.

Proceed formally with an asymptotic expansion

$$u_\epsilon(x,y) = u(x) + \epsilon\,\phi(x,y).$$

Substituting into (3.3) and retaining the terms of order one, we obtain

$$-\Delta_x u - \Delta_y \phi + \beta u = H(x, D_x u, y, D_y \phi) \tag{3.8}$$

which we try to match for any $x$, $y$ by a convenient choice of $u$ and $\phi$.
Consider $x$ in (3.8) as a parameter, as well as $p = D_x u$; set

$$L(y,v) = l(x,y,v) + p \cdot f(x,y,v)$$

$$G(y,v) = g(x,y,v) \tag{3.9}$$

$$H(y,q) = \inf_{v \in U_{ad}} \big[\, L(y,v) + q \cdot G(y,v) \,\big]$$

which also depend parametrically on $x$ and $p$.

One can then consider the Bellman equation of ergodic control relative
to (3.9). It is defined as follows: pick a constant $\chi$ (constant with respect
to $y$) and a function $\phi$ such that

$$-\Delta_y \phi + \chi = H(y, D_y \phi). \tag{3.10}$$

Suppose one can find such a pair $\chi$, $\phi$ depending parametrically on $x$, $p$;
hence,

$$\chi = \chi(x,p).$$

If $u$ satisfies

$$-\Delta u + \beta u = \chi(x, D_x u), \tag{3.11}$$

then the pair $u$, $\phi$ will satisfy (3.8). One can thus expect a solution of (3.11),
vanishing on the boundary $\Gamma$ of $O$, to be the limit of $u_\epsilon$.

This procedure depends on the possibility of being able to solve ergodic
control problems of the type (3.10). This control problem itself is as follows.
Consider

$$dy = G(y,v)\,d\tau + \sqrt{2}\,db, \quad y(0) = y \tag{3.12}$$

$$K_y(v(\cdot)) = \lim_{T \to \infty} \frac{1}{T}\, E \int_0^T L(y(\tau), v(\tau))\,d\tau;$$

then in general

$$\chi = \inf_{v(\cdot)} \{ K_y(v(\cdot)) \}$$

independent of $y$. The interpretation of $\phi$ is more delicate. Pick a feedback
$v(y)$ and consider the controlled state

$$dy = G(y, v(y))\,d\tau + \sqrt{2}\,db, \quad y(0) = y. \tag{3.13}$$

It seems inevitable to require ergodicity of the process $y$ to define a
well-posed control problem.

This means that as $\tau \to \infty$, $y(\tau)$ behaves like a random variable following
a probability $m^{v(\cdot)}(y)$, depending on the choice of $v(\cdot)$ and of the parameters
entering into the definition of $G$. Suppose, moreover, that $m$ is a probability
density with respect to Lebesgue measure; it is possible to give another
interpretation of $\chi$ as follows:

$$\chi = \inf_{v(\cdot)} \Big\{ \int L(y, v(y))\, m^{v(\cdot)}(y)\,dy \Big\}. \tag{3.14}$$

In fact, taking account of

$$E\,L(y(\tau), v(\tau)) \to \int_Y L(y, v(y))\, m^{v(\cdot)}(y)\,dy \quad \text{as } \tau \to \infty$$

one understands the relation between both interpretations of $\chi$. Formula
(3.14) permits a better interpretation of (3.11), which turns out to be a
Bellman equation for the slow system.
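
The averaged cost inside the infimum of (3.14) can be evaluated by hand in one dimension, which gives a quick sanity check. The sketch below is an illustration under stated assumptions, not the authors' method: it fixes a single feedback so that the fast drift is $G(y) = -U'(y)$ with the hypothetical potential $U(y) = y^2/2$, uses the fact that the invariant density of $dy = -U'(y)\,d\tau + \sqrt{2}\,db$ is proportional to $e^{-U(y)}$, and takes the hypothetical running cost $L(y) = y^2$. It computes the cost of that one feedback only; the infimum over feedbacks is not attempted.

```python
import math

def invariant_density(potential, lo=-8.0, hi=8.0, n=4000):
    """Grid approximation of m(y) proportional to exp(-U(y)) on [lo, hi]."""
    h = (hi - lo) / n
    ys = [lo + i * h for i in range(n + 1)]
    w = [math.exp(-potential(y)) for y in ys]
    z = h * (sum(w) - 0.5 * (w[0] + w[-1]))   # trapezoid normalisation constant
    return ys, [v / z for v in w], h

def ergodic_cost(cost, ys, m, h):
    """Trapezoid approximation of the integral of cost(y) * m(y) dy."""
    vals = [cost(y) * d for y, d in zip(ys, m)]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

ys, m, h = invariant_density(lambda y: 0.5 * y * y)   # assumed drift G(y) = -y
chi_v = ergodic_cost(lambda y: y * y, ys, m, h)       # assumed cost L(y) = y^2
```

Here the invariant density is the standard Gaussian, so `chi_v` approximates its second moment, which equals 1.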

Indeed

$$\chi(x,p) = \inf_{v(\cdot)} \int_Y \big( l(x,y,v(y)) + p \cdot f(x,y,v(y)) \big)\, m^{v(\cdot)}(y)\,dy.$$

Setting

$$\bar{l}(x, v(\cdot)) = \int_Y l(x,y,v(y))\, m^{v(\cdot)}(y)\,dy$$

$$\bar{f}(x, v(\cdot)) = \int_Y f(x,y,v(y))\, m^{v(\cdot)}(y)\,dy$$

then the limit problem is described by

$$\inf_{v(\cdot)} J(v), \qquad J(v) = E_x \int_0^\tau e^{-\beta t}\, \bar{l}(x, v(\cdot))\,dt$$

$$dx = \bar{f}(x, v(\cdot))\,dt + \sqrt{2}\,dw \tag{3.15}$$

$$x(0) = x.$$

It is interesting to note that the set of controls in (3.15) differs
from the original definition. One must consider feedback laws $v = v(y)$.
A control defined by a feedback with respect to the slow system is thus a
function $v(x,y)$. To justify these considerations, it is important to make
assumptions that guarantee the ergodicity of the process (3.13).
One way or another, there must be a Markov chain defined on a compact set
for which Doeblin’s theorem holds (see J.L. Doob [1]). This is achieved when
one assumes that $G$ is periodic in $y$ together with the feedback, or when one
considers a reflected diffusion instead of (3.13). The first case was treated in
the paper [2]. In this section we shall consider the case of diffusions on the
whole space. Reflected diffusions are treated in [3]. This section contains
a treatment of most cases where a natural ergodic fast system governs the
evolution of the state. There are other situations where different techniques
of singular perturbations are used. Examples of such situations may be
found in the paper of R. Jensen and P.L. Lions [4]. For other approaches to
ergodic control, see [5].

Acknowledgement: This is joint research with A. Bensoussan of INRIA.

3.1.2  Ergodic  control  for  diffusions  in  the  whole  space 
Assumptions  -  Notation 
We  consider 

$$g(y,v) : R^d \times U \to R^d$$

$$l(y,v) : R^d \times U \to R \tag{3.16}$$

continuous and bounded

$$U_{ad} \ (\text{compact}) \subset U \tag{3.17}$$

$U$ a metric space. For a given feedback $v(y)$, which is a Borel function
with values in $U_{ad}$, we shall solve in a weak sense the stochastic differential
equation

$$dy = (Fy + g(y, v(y)))\,dt + \sqrt{2}\,db^v(t), \quad y(0) = y. \tag{3.18}$$

The linear term $Fy$ will be useful to ensure an ergodicity property later on
($F$ a stable matrix). The Brownian motion $b^v$ is defined through a Girsanov
transformation. We can find a system $(\Omega, \mathcal{A}, \mathcal{F}^t, P^v)$ such that (3.18) holds.¹

¹We limit ourselves to feedback controls, since only those will appear in the singular
perturbation problem that we shall eventually solve. Of course, this is not at all necessary
for the ergodic control itself.

We then consider the function

$$K^\alpha_y(v(\cdot)) = E^v_y \int_0^\infty e^{-\alpha t}\, l(y(t), v(t))\,dt \tag{3.19}$$

where

$$v(t) = v(y(t)),$$

and we set

$$\phi_\alpha(y) = \inf_{v(\cdot)} K^\alpha_y(v(\cdot)). \tag{3.20}$$

Setting $A = -\Delta - Fy \cdot D$, we can assert that $\phi_\alpha$ is the solution of

$$A\phi_\alpha + \alpha\phi_\alpha = H(y, D\phi_\alpha) \tag{3.21}$$

$\phi_\alpha$ bounded, $\phi_\alpha \in W^{2,p,\mu}(R^d)$, $2 < p < \infty$,
where $W^{2,p,\mu}(R^d)$ denotes a Sobolev space with weight

$$\beta_\mu(y) = e^{-\mu (1 + |y|^2)^{1/2}} \tag{3.22}$$

and

$$L^{p,\mu} = \{ z(y) \mid z\beta_\mu \in L^p(R^d) \}$$

$$W^{2,p,\mu} = \Big\{ z \in L^{p,\mu} \ \Big|\ \frac{\partial z}{\partial y_i},\ \frac{\partial^2 z}{\partial y_i \partial y_j} \in L^{p,\mu} \Big\}.$$


Invariant  measures 

Since  the  diffusion  y(.)  does  not  lie  in  a  compact  set,  some  assumptions  on 
the  drift  g  are  necessary  to  ensure  ergodicity.  We  shall  mainly  use  the  results 
of  Khas’minskii  [6].  We  make  the  following  assumption: 

(A) There exists a bounded smooth domain $D$ and a function $\psi$ which is
continuous and locally bounded on $R^d - D$, $\psi \ge 0$, $\psi \in W^{2,p,\mu}(R^d - D)$, and

$$A\psi - g(y,v) \cdot D\psi \ge 1, \quad \forall v, \ \forall y \in R^d - D \tag{3.23}$$

$$\psi \ge 0, \quad \psi \to \infty \ \text{as} \ |y| \to \infty, \quad \text{and} \ |D\psi| \ \text{bounded}.$$

In general, one can try to find $\psi$ of the form

$$\psi(y) = \log Q(y) + k \tag{3.24}$$

$$Q(y) = \tfrac{1}{2} My \cdot y + m \cdot y + \rho \tag{3.25}$$

with $M$ symmetric and positive definite and $Q \ge 0$; and $D$ is a region containing
the zeros of $Q$.

The following condition must hold to have (3.23):

$$\frac{|My + m|^2}{\tfrac{1}{2} My \cdot y + m \cdot y + \rho} - \mathrm{tr}\,M - (Fy + g(y,v)) \cdot (My + m) \ \ge\ \tfrac{1}{2} My \cdot y + m \cdot y + \rho, \quad \forall y \in R^d - D, \tag{3.26}$$

for a convenient choice of $M$, $m$, and $\rho$. For instance, if $d = 2$, we can take
$M = I$, $m = 0$, $\rho = 0$ and (3.24) is satisfied provided that, for instance


F<(---A)I 


(3.27) 


and  D  is  a  sufficiently  large  neighborhood  of  0. 

Consider a domain $D_1$ such that $\bar{D} \subset D_1$, $D_1$ smooth and bounded. Let
$\Gamma$ and $\Gamma_1$ be the boundaries of $D$, $D_1$, respectively. We shall construct a
Markov chain on $\Gamma_1$. Let $x \in R^d$; we define

$$\theta'(x; \omega) = \inf\{ t \mid y_x(t) \in D \} \tag{3.28}$$

$$\theta(x; \omega) = \inf\{ t \ge \theta'(x; \omega) \mid y_x(t) \notin D_1 \} \tag{3.29}$$

In (3.28), (3.29) $y_x(t)$ is the diffusion (3.18) with initial condition $x$. Using
$\psi(x)$, we can write

$$E^v_x\, \theta'(x) \le \psi(x). \tag{3.30}$$


This implies also that the exterior Dirichlet problem

$$A\eta - g(y, v(y)) \cdot D\eta = 0, \quad y \in R^d - \bar{D} \tag{3.31}$$

$$\eta|_\Gamma = h, \quad h \in L^\infty(\Gamma)$$

has a bounded solution given explicitly by

$$\eta(x) = E^v_x\, h(y_x(\theta'(x))). \tag{3.32}$$

The Markov chain on $\Gamma_1$ is then constructed as follows. We define two
sequences of stopping times (relative to $\mathcal{F}^t$),

$$\tau_0, \tau_1, \tau_2, \ldots$$

$$\tau'_1, \tau'_2, \ldots$$

such that

$$\tau_0 = 0$$

$$\tau_n = \inf\{ t \ge \tau'_n \mid y(t) \notin D_1 \}, \quad n \ge 1$$

$$\tau'_{n+1} = \inf\{ t \ge \tau_n \mid y(t) \in D \}, \quad n \ge 0.$$

The process $y(t)$ in the brackets is the process defined by (3.18), i.e., with
initial condition $y$. Let us set $Y_n = y(\tau_n)$, $n \ge 1$. Then $Y_n \in \Gamma_1$ and is a
Markov chain with transition probability defined by

$$E^v\big[ \phi(Y_{n+1}) \mid \mathcal{F}^{\tau_n} \big] = E^v_x\, \phi(y_x(\theta(x)))\big|_{x = Y_n}. \tag{3.33}$$

We define the following operator on Borel bounded functions on $\Gamma_1$:

$$P\phi(x) = E^v_x\, \phi(y_x(\theta(x))). \tag{3.34}$$

We can give an analytic formula as follows. Consider the problem

$$A\zeta - g(y, v(y)) \cdot D\zeta = 0 \ \text{in} \ D_1, \quad \zeta|_{\Gamma_1} = \phi. \tag{3.35}$$

We first note that

$$E^v_x\, \phi(y_x(\theta(x))) = E^v_x\, \zeta(y_x(\theta'(x)));$$

therefore, taking account of (3.32), we have

$$P\phi(x) = \eta(x) \tag{3.36}$$

where $\eta$ denotes the solution of (3.31) corresponding to the boundary condition
$h = \zeta$. Of course, in (3.36) $x \in \Gamma_1$ are the only relevant points. We
then have

Lemma 1.1. The operator $P$ is ergodic.

Proof. See [3] for the proof of this and all the remaining lemmas in this
section.

From ergodic theory, it follows

$$\Big| P^n\phi(x) - \int \phi(\sigma)\,\pi(d\sigma) \Big| \le K \|\phi\|\, e^{-\rho n}, \quad x \in \Gamma_1, \tag{3.37}$$

where $K$, $\rho$ are uniform with respect to the feedback control $v(\cdot)$, and $\pi = \pi^v$
denotes the invariant probability on $\Gamma_1$.
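
The geometric estimate (3.37) has a simple finite-state analogue that can be checked numerically. The sketch below is an illustration only: the 3-state transition matrix `P` and the test function `phi` are hypothetical, and the chain on $\Gamma_1$ in the text is of course not finite. Because every entry of `P` is positive, a Doeblin-type condition holds and $P^n\phi$ converges to the invariant average at a geometric rate.

```python
def apply_operator(P, phi):
    """(P phi)(i) = sum_j P[i][j] * phi[j]."""
    return [sum(p * f for p, f in zip(row, phi)) for row in P]

def invariant_probability(P, sweeps=200):
    """Approximate the invariant row vector pi = pi P by repeated multiplication."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(sweeps):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

# Hypothetical transition matrix with strictly positive entries (rows sum to 1).
P = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.1, 0.4, 0.5]]
phi = [1.0, -2.0, 0.5]          # hypothetical bounded test function

pi = invariant_probability(P)
target = sum(w * f for w, f in zip(pi, phi))

# Sup-norm distance between P^n phi and the invariant average, n = 1..10.
errors = []
current = phi
for n in range(10):
    current = apply_operator(P, current)
    errors.append(max(abs(c - target) for c in current))
```

The list `errors` is non-increasing and shrinks by a roughly constant factor per step, which is the discrete counterpart of the bound $K\|\phi\|e^{-\rho n}$.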
It follows that, since

$$P^n\phi(y) = E^v_y\, \phi(y(\tau_n)),$$

we can write

$$\Big| E^v_y\, \phi(y(\tau_n)) - \int \phi(\sigma)\,\pi(d\sigma) \Big| \le K \|\phi\|\, e^{-\rho n}. \tag{3.38}$$

We  can  then  define  a  probability  on  Zd,  by  the  formula 

«■» 

VA  Borel  bounded  in  R.d. 

Following Khas’minskii, one can then prove that the invariant probability is
unique and has a density with respect to Lebesgue measure, denoted by $m = m^v$,
which is the solution of

$$A^* m + \mathrm{div}(m g^v) = 0, \quad m > 0, \quad \int m(y)\,dy = 1, \tag{3.40}$$

where

$$A^* m = -\Delta m + \mathrm{div}(Fy\, m).$$
Consider now the Cauchy problem

$$\frac{\partial z}{\partial t} + Az - g^v \cdot Dz = 0, \quad z(y, 0) = \phi(y). \tag{3.41}$$

Lemma 1.2. We have

$$z(y, 1) \le c\, |\phi|_{L^1}. \tag{3.42}$$

We deduce from Lemma 1.2 an estimate on the invariant probability
solution of (3.40). Using

$$\int_{R^d} m^v(y)\, z(y, 1)\,dy = \int_{R^d} m^v(y)\, \phi(y)\,dy$$

we deduce easily that

$$m^v(y) \le \Lambda, \quad \forall y, \ \forall v(\cdot). \tag{3.43}$$

It  follows  that  mv  is  uniformly  bounded  in  Lp(Zd),'^p,  1  <  p  <  oo.  Let  6 
be  an  element  of  we  have 


-  A (mO)  +  div(m0y)  +  Fy  •  D(9m) 


(3.44) 


=  m(D9  ■  Fy  -  ydivfl  -  0trF)  =  / 
and  /  €  L*>(Zd),\/p,  1  <  p  <  oo. 

From  results  on  the  Dirichlet  problem,  it  follows  that  mff  belongs  to 
Wl,p(£*),  Vp,  1  <  p  <  oo.  In  particular,  mO  is  continuous.  Therefore,  we 
deduce  that 

mv(y)  >  A*  >  0,Vy  £  if,  compact  (3.45) 

where  the  constant  A*  does  not  depend  on  v(-). 

Remark  1.2.  The  assumption  (3.23)  requires  D  nonempty.  Otherwise 
(3.23)  and  (3.40)  yield  /  mdy  —  0,  which  is  impossible. 

We  also  shall  consider  the  following  approximation  to  m.  Let  Br  be  the 
ball  of  radius  R ,  centered  at  0.  Let  us  consider  mR  defined  by 

A'mj  +  div(mfty‘')  +  Am*  =  ArRm  (3.46) 

mR\aB„  =  o 

mR  €  WZ'P{BR) 

in  which  A  is  sufficiently  large  so  that 

ici*  -  e  •  9* + (a + ^trf)*2  >  o(ier + o2) 

VZ€Zd,9eZ 

Moreover,  r*(y)  =  r(y/R)  where  r(y)  is  smooth  r(y)  =  0  for  |y|  >  1, 
r(y)  =  1,  for  [y|  <  -  and  0  <  r  <  1. 

m 

We  have 

Lemma  1.3.  m*  the  extension  of  mR  by  0  outside  Br,  converges  to  m 
in  H1(Zi)  strongly  and  mq  =  mj,  converges  monotonically  increasing  to  m. 


Hamilton-Jacobi-Bellman equation of ergodic control
We consider the following problem: Find a pair $\chi$, $\phi$ such that

$$A\phi + \chi = H(y, D\phi) \tag{3.47}$$

with

$$\phi/\psi \ \text{bounded at} \ \infty. \tag{3.48}$$

Our objective is to prove the following

Theorem 1.1. We assume (3.16), (3.17), (3.23). Then there is one and
only one $\phi$ (up to an additive constant) and a scalar $\chi$ such that (3.47),
(3.48) hold.

We begin with some preliminary steps. Let us consider a feedback $v_\alpha(\cdot)$
such that (c.f. (3.21)) we may write

$$A\phi_\alpha + \alpha\phi_\alpha = l(y, v_\alpha) + D\phi_\alpha \cdot g(y, v_\alpha). \tag{3.49}$$

Then let $m_\alpha$ be the invariant probability corresponding to the feedback $v_\alpha$
in equation (3.40). We then have

Lemma 1.4. The following relation holds

$$\int \big( \alpha\phi_\alpha - l(y, v_\alpha) \big)\, m_\alpha\,dy = 0$$


Lemma  1.5.  We  have 


!*„<») - /fi MiKMI  <  {  £ 


(3.50) 


(3.51) 


where  the  constant  does  not  depend  on  a,  nor  y. 

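Proof of Theorem 1.1.

Existence

Let us set $\tilde\phi_\alpha = \phi_\alpha - \int_{\Gamma_1} \phi_\alpha(\sigma)\,\pi_\alpha(d\sigma)$. Then $\|\tilde\phi_\alpha/\psi\|_{L^\infty} \le C$. Moreover,
from (3.21) we also have

$$A\tilde\phi_\alpha + \alpha\tilde\phi_\alpha + \chi_\alpha = H(y, D\tilde\phi_\alpha), \tag{3.52}$$

in which

$$\chi_\alpha = \alpha \int_{\Gamma_1} \phi_\alpha(\sigma)\,\pi_\alpha(d\sigma).$$

It readily follows from (3.52) that $\tilde\phi_\alpha$ is bounded in $W^{2,p,\mu}(R^d)$, $2 < p < \infty$, $\mu > 0$.
We can extract a subsequence such that

$$\chi_\alpha \to \chi, \qquad \tilde\phi_\alpha \to \phi \ \text{in} \ W^{2,p,\mu}(R^d) \ \text{weakly}.$$

We can assert that

$$\tilde\phi_\alpha, \ D\tilde\phi_\alpha \to \phi, \ D\phi \ \text{pointwise},$$

hence,

$$H(y, D\tilde\phi_\alpha) \to H(y, D\phi) \ \text{pointwise}.$$

Noting that $H(y, D\tilde\phi_\alpha)$ is bounded in $L^{p,\mu}$, we can pass to the limit in (3.52),
and the pair $\phi$, $\chi$ satisfies (3.47), (3.48).

See [3] for the proof of uniqueness.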

3.1.3  Singular  perturbations  with  diffusions  in  the  whole 
space 


Setting  of  the  problem 
We  consider 

$$f(x,y,v) : R^n \times R^d \times U \to R^n \tag{3.53}$$

$$g(x,y,v) : R^n \times R^d \times U \to R^d$$

$$l(x,y,v) : R^n \times R^d \times U \to R$$

continuous and bounded

$$U_{ad} \ \text{compact} \subset U \ (\text{a metric space}). \tag{3.54}$$

On a convenient set $(\Omega, \mathcal{A}, \mathcal{F}^t, P^\epsilon)$ we define a dynamic system, composed
of a slow and a fast system described by the equations (3.1), with $g$ replaced
by $Fy + g(x,y,v)$. The cost function is defined by (3.2), and we are interested
in the behavior of the value function $u_\epsilon(x,y)$. It is given as the solution of
the Hamilton-Jacobi-Bellman equation (noting $A_y = -\Delta_y - Fy \cdot D_y$)

$$-\Delta_x u_\epsilon + \frac{1}{\epsilon} A_y u_\epsilon + \beta u_\epsilon = H\Big(x, D_x u_\epsilon, y, \frac{1}{\epsilon} D_y u_\epsilon\Big) \tag{3.55}$$

$$u_\epsilon = 0, \quad \forall x \in \Gamma, \ \forall y$$

$$u_\epsilon \in W^{2,p,\mu}(O \times R^d), \quad 2 < p < \infty.$$

By $W^{2,p,\mu}(O \times R^d)$ we mean in fact (since $O$ is bounded) the set of functions
$z$ such that $z\beta_\mu(y)$ belongs to $W^{2,p}(O \times R^d)$.

We shall denote by $v_\epsilon(x,y)$ the optimal feedback. The assumption (3.23)
is replaced by

$$A_y\psi - g(x,y,v) \cdot D_y\psi \ge 1, \quad \forall x, v, \ \forall y \in R^d - D \tag{3.56}$$

and the requirement that $D$, $\psi$ have the same properties as in (3.23).

Approximation  to  the  invariant  measure 

We  shall  consider  the  following  invariant  measures.  For  a  feedback  v(y), 
consider  mv  (z,  y)  which  is  the  solution  of 


A*m  +  divv(myt’)  =  0 

>  0,  f  m(x,  y)dy  =  1,  m  e  Vz. 

JR* 


(3.57) 


For  a  feedback  v(x,  y)  we  shall  consider  (z,  y)  which  is  the  solution  of 

-  cAzmt  +  A*m(  +  divy(myu)  =  0  (3.58) 

^|r  =  0,meeH1(Ox  Zd) 

m,  >  0,  /  me(r,y)dy  =  1,  Vz. 

JR* 

In  particular,  we  shall  call  m(  the  solution  of  (3.58)  corresponding  to  the 
feedback  v<(z,  y)  as  defined  in  the  preceding  paragraph.  The  construction 
of  the  invariant  probability  m(  is  done  in  a  way  similar  to  that  of  m.  Let 
us  consider  D,Di  as  in  (3.23).  To  avoid  confusion  in  the  notation,  let  us 
call  T,  Ti  the  respective  boundaries  of  D,Di  (instead  of  r,  Tj,  since  now  T 
denotes  the  boundary  of  O).  We  consider  the  stochastic  processes 

dx  =  y/edw  -  xt[x,  t)vd£,  x(0)  =  z 

dy  -  Fy  +  g{x,  y,  v(x,  y))dt  +  y/2 db„(t),  y( 0)  =  y 


.  A  %  * 


which  «e  defined  on  a  system  (fl,  A,  F*,  P*,w)  and  ti),  6  are 

independent 

* 

standard  Wiener  processes. 

,  m 

f. 

We  define 

e\x,y,(l)  =  inf{t|y(t)  €  D} 

>. 

> 

d(x,y;n)  =  inf{t>d'|y(t)^Di} 

and  we  have  (c.f.  (3.30))* 

§ 

E0\x,y)<4,(  y). 

*v 

Define  the  sequence  of  stopping  times  ro  =  0,  rn,  r  n+\  as  in  section  1.2, 

.X 

^ 

and  the  Markov  chain  Xn  =  x(r„),Yn  =  y(rn)  which  is  a  Markov  chain  on 
O  x  IV  We  then  define  the  linear  operator  on  Borel  bounded  functions  on 

> 

'*  t. 

O  x  Ti  by  the  relation 

PV(*,y)  =  £JV(*(*),y(*)). 

(3.59) 

4 

We  deduce  the  analytic  formula  (c.f.  (3.36)) 

■ 

PV(*,y)  =  *?e(x,  y) 

(3.60) 

>’« 

where 

•V 

-  cA,»j  +  Ayri  -  gv  •  DJJ  =  0, 

(3.61) 

1 

on  O  x  (£*  -  D) 

> 
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Q>|  Q, 
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£ 

-  eA*f  +  -  gu  •  Dv$  =  0  on  O  x  Di 

(3.62) 

flr«  =  |^lr  =  0 

✓ 

The  ergodicity  of  P*  is  proved  like  that  of  P  (c.f.  Lemma  1.1).  Let 
x((dx,da)  be  the  corresponding  invariant  probability  on  O  x  Ti.  We  then 

/ 

define  the  probability  ^{dx,  dy)  on  O  x  %d  by  the  formula 

;3 

Jo  A(*,  y)W(x,  y) 

(3.63) 

_  Jo  /r,  /o,(f’n)  A(x(t),  y(t))dt]x‘(df,  dr,) 

A 

k 

*Here  E  =  for  short. 
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for  any  A  Borel  bounded  on  O  x  £  .  Let  us  note  that  we  can  also  give  an 
analytic  formula  for  the  quantity 


namely 


We  have 


a£(z,  y)  =  E?  f  (  )  A(z(t),  y(t))dt 

J  0 

-  eAxdt  +  Ava  -  gv  •  Dva  =  A 
in  O  x  (J ld  -  D) 

a\r=P,^\r  =  0 

-eAx/3  +  Ayp  -  gv  •  Dy/3  =  A  in  Di 

^|rI=°,|^|r  =  0. 


d/i*(z,  y)  =  me(z,  y)dxdy. 
Moreover,  considering  the  Cauchy  problem 
dz 

~-eAxz  +  AyZ-g'>-Dvz  =  0 


(3.64) 


(3.65) 


(3.66) 


— |r  =  0,  z(z,  y,  0)  =  A(z,  y) 


we  have 


/  /  A(z,  y)m'(z,  y)dxdy  =  /  /  m'(z,y)z(z,y,t)dzdy, 
JO  JR. *  JO  JR* 


Vt  >  0 


and  we  deduce  from  this 


0  <  A*  <  mt(z,y)  <  Ai,  (3.67) 

Vz  €  0,Vy  €  K,  compact  of  £* 

with  constants  uniform  with  respect  to  v(>),  the  left  constant  (but  not  the 
right)  depending  on  the  compact  K. 

To proceed we shall slightly reinforce the assumption (3.56) as follows:

$$A\psi - k_0 |D\psi| \ge 1, \quad \forall y \in R^d - D \tag{3.68}$$

and $D$, $\psi$ have the same properties as in (3.23). In (3.68) $k_0$ is a constant
such that

$$|g(x,y,v)| \le k_0. \tag{3.69}$$

Note that (3.68) is satisfied in the example (3.27).

Lemma 2.1. Let $B_\rho$ be the ball of radius $\rho$ in $R^d$, and $B_\rho^c = R^d - B_\rho$.
Then

$$\int_O \int_{B_\rho^c} m_\epsilon(x,y)\,dx\,dy \le \lambda(\rho) \tag{3.70}$$

where $\lambda(\rho) \to 0$ as $\rho \to \infty$.

Consider  also  as  in  (3.46)  the  solution  mcR  of 

-  eAzmtJt  +  Ajme*  +  divy{mtRgv) 

+A.m{R  =  hrRmtR 

—7^\v,mtR\aBn  =  0 

then  we  have 

mtR  — *  m,  in  Ll  C  Hl  as  R  -*  oo. 


(3.71) 


J.V.V 


(3.72) 


A  priori  estimate 

We  shall  need  the  approximation  of  ue  given  by 

-  A,U fR  -  J  AyUtR  +  &UtR 


(3.73) 


and 


—  DxufR,  y,  Dy\itR) 

ue  =  0  on  3(0  x  Br) 

utR  — *  u«  in  weakly  and  in  L°°  weak  star 


(3.74) 

where  loc  is  meant  only  for  the  y  variable.  We  shall  need  also  a  similar 
approximation  in  the  case  of  explicit  feedbacks;  in  particular  v( 

Lemma  2.2.  The  following  estimates  hold 


—  C>  |u‘|l»  <  C 

!•« 
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Lemma  2.3.  Let  4>(x)  e  (O)  C  H2(0),  then  we  have  the  inequality 
f  J  m«|Dx(ut  -  4>)\2dxdy  (3.76) 

J  J  m^DyUt^dxdy 

+  J J  Pmt( ue  -  <f>)2dxdy  <  J  J  miU((A4>  -  [3<t>)dxdy 
+  J  (\Dt<f>\2  +  M2)dx 

Convergence 

Lemma 2.4. Let us consider a subsequence of $u_\epsilon$ such that

$$u_\epsilon \to u \ \text{in} \ H^1_{loc}(O \times R^d) \ \text{weakly}. \tag{3.77}$$

Then $u$ is a function of $x$ only, belongs to $H^1_0(O)$, and the convergence (3.77)
is strong.

We now identify the limit. Let us recall the definition of $m^v$ given in
(3.57). Define $\chi(x,p)$ by the formula

$$\chi(x,p) = \inf_{v(\cdot)} \int_{R^d} m^v(x,y) \big( l(x,y,v(y)) + p \cdot f(x,y,v(y)) \big)\,dy \tag{3.78}$$

and consider the Dirichlet problem

$$-\Delta u + \beta u = \chi(x, Du), \tag{3.79}$$

$$u|_\Gamma = 0, \quad u \in W^{2,p}(O).$$

We can then state the following

Theorem 2.1. We assume (3.53), (3.54) and (3.68). Then we have

$$u_\epsilon \to u \ \text{in} \ H^1_{loc}(O \times R^d) \ \text{strongly}. \tag{3.80}$$

See [3] for details of the proof.


Interpretation  of  the  limit  problem 

The limit problem is written as

$$-\Delta u + \beta u = \inf_{v(\cdot)} \big\{ \bar{l}(x, v(\cdot)) + Du \cdot \bar{f}(x, v(\cdot)) \big\}, \quad u|_\Gamma = 0 \tag{3.81}$$

where we have set

$$\bar{l}(x, v(\cdot)) = \int_Y m^v(x,y)\, l(x,y,v(y))\,dy \tag{3.82}$$

$$\bar{f}(x, v(\cdot)) = \int_Y m^v(x,y)\, f(x,y,v(y))\,dy.$$

It is clear that (3.81) is a Hamilton-Jacobi-Bellman equation for a slow
system whose drift is $\bar{f}$, and whose integral cost is $\bar{l}$. For this problem the set of
controls is the set of Borel functions $v(y)$ with values in $U_{ad}$. A feedback
on the slow system is thus still a function $v(x,y)$. There exists an optimal
feedback for the limit problem, namely $v(x,y)$ obtained in (3.6). Indeed
consider the function $V$ defined in (3.5), then

$$\hat{v}(x,y) = V(x, Du, y, D_y\phi) \tag{3.83}$$

is an optimal feedback for the limit problem. In fact, this is the feedback to
be applied on the real system as a surrogate for $v_\epsilon(x,y)$ defined in (3.58).
One can show, by techniques similar to those used in previous paragraphs to
obtain Theorem 2.1, that the corresponding cost function will converge as $\epsilon$
tends to 0 to $u$ in $H^1(O \times Y)$. Note that unlike the deterministic situation the
optimal feedback for the limit problem is not a function of $x$ only. In fact
(3.83) corresponds to the composite feedback of Chow-Kokotovic [7] (c.f.
also [8] in the deterministic case).

3.2  Optimal  Stochastic  Scheduling  of  Systems  with 
Poisson  Noises 

In this section we consider the problem of optimal stochastic scheduling
for nonlinear systems with Poisson noise disturbances and a performance
index including both operating costs and costs for scheduling changes. In
general, the value functions of the dynamic programming quasi-variational
inequalities which define the optimality conditions for such problems are
not differentiable. However, we can treat them as “viscosity solutions” as
introduced by Crandall and Lions. Existence and uniqueness questions are
studied from this point of view.

3.2.1  Introduction 

Optimal scheduling problems arise in many contexts, including inventory
control systems and resource allocation problems in military systems planning.
These problems typically involve stochastic dynamical systems, admitting
discrete state transitions at random times as control actions, and
incurring both switching costs and continuous running costs. Using the
dynamic programming principle, one can show that the optimality conditions
for these problems are expressed mathematically by quasi-variational
inequalities (QVI). It is difficult to treat QVI’s explicitly, and most of the
work has focussed on proving existence, uniqueness, and regularity of solutions.

In our case, the state system is forced by Poisson noises. Since the
infinitesimal generator of the state process is first order and has a translation
in the argument, the associated QVI is first order and fully nonlinear; and
so, the standard existence and uniqueness theory developed for diffusion-parabolic
systems does not apply. To treat the problem, we use the method
of viscosity solutions introduced by M. G. Crandall and P. L. Lions [9].
Various properties of viscosity solutions are developed in Crandall-Evans-Lions
[11]. We use the approach in Capuzzo Dolcetta-Evans [12] developed
for deterministic systems.³

We prove that the value function $u$ associated with the optimization
problem is a viscosity solution of the corresponding QVI. Existence of
solutions to the QVI is shown by using a discrete approximation to an
associated penalized system and then using results for accretive operators
as in [15]. On the other hand, we use dynamic programming to obtain a
decreasing sequence of value functions $u_l$, optimal for controls with at most
$l$ switches, which converges uniformly. This approach was used to obtain
a maximum solution of certain QVI’s in Menaldi [10-11] without nondegeneracy
assumptions. In Blankenship-Menaldi [20], related problems were
treated involving the application of QVI to power generation systems with
scheduling delays. See also [21] [22] for a survey of viscosity methods for the
control of diffusions.

The optimal stochastic control of linear regulator systems with Poisson
noise disturbances is considered in [23]; stochastic stability properties of
linear systems with multiplicative Poisson noises are derived in [25]. See
also [26].

Problem  Statement 

Let $(\Omega, F, P)$ be a probability space and $F_t$, $t \ge 0$, a non-decreasing,
right-continuous family of completed sub $\sigma$-fields of $F$ such that $F_t \uparrow F_\infty := F$, $t \ge 0$.
Consider the general nonlinear dynamical system

$$dy_x(t) = g(y_x(t), \alpha(t))\,dt + h(y_x(t), \alpha(t))\,dN_{\alpha(t)}(t), \qquad y_x(0) = x \tag{3.84}$$


where $N_i(t)$, $i = 1, \ldots, m$, are independent Poisson processes with intensities
$\lambda_i$, $i = 1, \ldots, m$. $\alpha(t)$ is a right-continuous, piecewise constant random
function with finite range $\{1, \ldots, m\}$, and is measurable with respect to $F_t$, $t \ge 0$.
Actually, $\alpha$ is an admissible control consisting of random switching times
$\theta_i$ and random switching decisions $d_i$, such that the $\theta_i$ are adapted to $F_t$ and the $d_i$
are $F_{\theta_i}$-measurable, so that

$$0 = \theta_0 \le \theta_1 \le \ldots \le \theta_{i-1} \le \theta_i \le \theta_{i+1}, \quad \theta_i \to +\infty \ \text{a.s.}$$

$$d_i \in \{1, \ldots, m\}, \quad d_i \ne d_{i-1} \ \text{if} \ \theta_i < \infty \tag{3.85}$$

And so

$$\alpha(t) := d_i \ \text{if} \ \theta_i \le t < \theta_{i+1}, \quad i \ge 0$$

is indeed $F_t$-measurable.

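Let the set of all admissible controls with initial setting $d$ be

$$A^d := \{ \alpha \mid \alpha = \{\theta_i, d_i\} \ \text{satisfies the above properties with initial setting} \ d_0 = d \}. \tag{3.86}$$

We take the performance index to be

$$J^d_x(\alpha) := E \Big\{ \int_0^\infty f(y_x(t), \alpha(t))\, e^{-\beta t}\,dt + \sum_{i=1}^\infty k(d_{i-1}, d_i)\, e^{-\beta\theta_i} \Big\}$$

$$= E \sum_{i=1}^\infty \Big\{ \int_{\theta_{i-1}}^{\theta_i} f(y_x(t), d_{i-1})\, e^{-\beta t}\,dt + k(d_{i-1}, d_i)\, e^{-\beta\theta_i} \Big\} \tag{3.87}$$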

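where $\beta > 0$ is a discount factor and $k(d, \tilde d)$ is the cost of switching from $d$
to $\tilde d$ such that⁴

$$k(d, \tilde d) > 0 \ \text{if} \ d \ne \tilde d; \quad k(d,d) = 0 \tag{3.88}$$

$$k(d, \tilde d) < k(d, \hat d) + k(\hat d, \tilde d) \ \text{if} \ d \ne \hat d \ne \tilde d.$$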
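
The conditions (3.88) are easy to verify mechanically for a given cost table. The helper below is a hypothetical illustration, not part of the original development: the function name, the matrix representation `k[d][e]`, and the two sample tables are assumptions, and the strict triangle inequality is checked over triples of pairwise distinct settings.

```python
from itertools import permutations

def satisfies_switching_conditions(k):
    """Check (3.88) for a square table k[d][e] = cost of switching from d to e."""
    m = len(k)
    # k(d, d) = 0: staying in the same setting is free.
    if any(k[d][d] != 0 for d in range(m)):
        return False
    # k(d, e) > 0 for d != e: every genuine switch has a positive cost.
    if any(k[d][e] <= 0 for d, e in permutations(range(m), 2)):
        return False
    # Strict triangle inequality: a direct switch is cheaper than any detour.
    for d, mid, e in permutations(range(m), 3):
        if not k[d][e] < k[d][mid] + k[mid][e]:
            return False
    return True

good = [[0, 1, 1.5], [1, 0, 1], [1.5, 1, 0]]   # hypothetical admissible table
bad = [[0, 1, 3], [1, 0, 1], [3, 1, 0]]        # detour 0 -> 1 -> 2 beats 0 -> 2
```

The strict inequality rules out tables like `bad`, where two successive switches at the same instant would cost less than the direct switch.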

Without loss of generality, we can define $k_0 := \min k(d, \tilde d)$, $d \ne \tilde d$. We
assume $f \ge 0$, and that $f$, $g$ and $h$ are bounded and Lipschitz continuous:

$$|q(x,d)| \le \|q\| < \infty$$

$$|q(x,d) - q(\tilde x, d)| \le L\,|x - \tilde x| \tag{3.89}$$

with $q = f$, $g$ and $h$, for all $x, \tilde x \in R^n$, $d \in \{1, \ldots, m\}$.

Under these assumptions, (3.84) has a unique solution. Defining the value
function

$$u^d(x) := \inf_{\alpha \in A^d} J^d_x(\alpha), \quad x \in R^n, \ d \in \{1, \ldots, m\} \tag{3.90}$$

we want to design an optimal control $\alpha^*$ such that

$$u^d(x) = J^d_x(\alpha^*) = \inf_{\alpha \in A^d} J^d_x(\alpha). \tag{3.91}$$

Remark. $N_{\alpha(t)}(t)$ is an inhomogeneous Poisson process with intensity
function $\lambda_{\alpha(t)}$.

Summary  of  Results. 

In subsection 2.2 we show that the optimal value function $u^d(x)$ in (3.21)
may be defined as the limit of the value functions $u^d_l(x)$ of systems with a
finite number $l$ of switches as $l \to \infty$ (Theorem 2.3). We show that the
convergence is uniform (Theorem 2.5); and we derive two representations of
$u^d(x)$ as the optimal value function (Theorems 2.6 and 2.7). We describe
the associated optimal (control) switching policy (Theorem 2.8), and we use
it to obtain an additional estimate on the convergence of $u^d_l$ to $u^d$.

In  subsection  2.3  we  derive  the  QVI  which  must  be  satisfied  by  the 
optimal  value  function  (equation  (3.113)).  We  show  that  the  optimal  value 
function  is  a  viscosity  solution  of  the  QVI  (Theorem  3.1).  Then  we  show 
that  the  solution  is  unique. 

In subsection 2.4 we prove that the QVI has a viscosity solution by constructing
a sequence of solutions to a penalized system (equation (3.119))
and proving that these solutions are uniformly bounded and uniformly Holder
continuous (Theorem 4.4). We show that the limit of the sequence of solutions
to the penalized system is a viscosity solution of the QVI.

⁴The case when the switching costs can be zero is treated in subsection 2.5.

In subsection 2.5 we consider the case when the switching costs vanish
($k(d, \tilde d) = 0$ for $d \ne \tilde d$ in (3.19)). In this case the optimal value function
$u$ is independent of the initial control configuration $d$ (since we can switch
without cost at any time), and it (formally) satisfies a Hamilton-Jacobi-Bellman
equation which is fully nonlinear in $\nabla u$. The method of viscosity
solutions is required to treat this case. We show that the optimal value
function corresponding to non-zero switching costs will converge to $u$ as the
switching costs tend to zero, and that $u$ is the unique viscosity solution
of the Hamilton-Jacobi-Bellman equation. The result is analogous to
those in Capuzzo Dolcetta-Evans [12]. Thus, the method of viscosity
solutions provides a complete framework for the treatment of the optimal
control problem (3.16) - (3.23) over the full range of parameter values and
operating regimes.

3.2.2  Dynamic  Programming  and  Preliminary  Results. 

Before  using  dynamic  programming  to  investigate  the  properties  of  the  value 
function  ud(z),  we  need  some  preliminary  results. 

Lemma 2.1. For any stopping time $\tau$ which is adapted to $F_t$ and any
measurable bounded function $q$, we have

$$E\big[ q(y_x(\tau + t)) \mid F_\tau \big] = E\, q(y_\xi(t))\big|_{\xi = y_x(\tau)}. \tag{3.92}$$

Proof  See  [24]  for  the  proof  of  this  and  all  other  lemmas  and  theorems 
in  this  section. 

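Lemma 2.2. (i) For each $d \in \{1, \ldots, m\}$ and $x \in R^n$,

$$u^d(x) \le \min_{\tilde d \ne d} \{ u^{\tilde d}(x) + k(d, \tilde d) \}. \tag{3.93}$$

(ii) For any stopping time $\theta \ge 0$,

$$u^d(x) \le E \Big\{ \int_0^\theta f(y_x(s), d)\, e^{-\beta s}\,ds + u^d(y_x(\theta))\, e^{-\beta\theta} \Big\}. \tag{3.94}$$

Notation. For $x \in R^n$, $d \in \{1, \ldots, m\}$,

$$M^d[u](x) := \min_{\tilde d \ne d} \{ u^{\tilde d}(x) + k(d, \tilde d) \}. \tag{3.95}$$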

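Now, we want to use the dynamic programming principle to show there
exists a convergent sequence of optimal solutions $u^d_l$ of the problem with
respect to controls which have at most $l$ switches.

For each $x \in R^n$, $d \in \{1, \ldots, m\}$, let

$$u^d_0(x) := E \int_0^\infty f(y_x(s), d)\, e^{-\beta s}\,ds. \tag{3.96}$$

Notation. If $u, v \in C(R^n)^m$, then we say $u \ge v$ if $u^d \ge v^d$, $\forall d = 1, \ldots, m$.
Define an operator $\Gamma_d : C(R^n)^m \to C(R^n)$ by

$$\Gamma_d u(x) := \inf_\theta E \Big\{ \int_0^\theta f(y_x(s), d)\, e^{-\beta s}\,ds + e^{-\beta\theta}\, M^d[u](y_x(\theta)) \Big\}. \tag{3.97}$$

Here we understand the infimum is taken over all stopping times $\theta \ge 0$
adapted to $F_t$. If $u \ge v$, then for each $\epsilon > 0$, there exists a stopping
time $\theta_\epsilon \ge 0$ and $d_\epsilon$, $F_{\theta_\epsilon}$-measurable, such that

$$\Gamma_d u(x) \ge E \Big\{ \int_0^{\theta_\epsilon} f(y_x(s), d)\, e^{-\beta s}\,ds + e^{-\beta\theta_\epsilon} \big[ u^{d_\epsilon}(y_x(\theta_\epsilon)) + k(d, d_\epsilon) \big] \Big\} - \epsilon$$

$$\ge E \Big\{ \int_0^{\theta_\epsilon} f(y_x(s), d)\, e^{-\beta s}\,ds + e^{-\beta\theta_\epsilon} \big[ v^{d_\epsilon}(y_x(\theta_\epsilon)) + k(d, d_\epsilon) \big] \Big\} - \epsilon$$

$$\ge \Gamma_d v(x) - \epsilon.$$

Letting $\epsilon \downarrow 0$, we have $\Gamma_d u \ge \Gamma_d v$. Let $0 \le \eta \le 1$, then

$$\Gamma_d[(1 - \eta)u + \eta v](x) = \inf_\theta E \Big\{ \int_0^\theta f(y_x(s), d)\, e^{-\beta s}\,ds + e^{-\beta\theta}\, M^d[(1 - \eta)u + \eta v](y_x(\theta)) \Big\}$$

$$\ge \inf_\theta E \Big\{ \int_0^\theta f(y_x(s), d)\, e^{-\beta s}\,ds + e^{-\beta\theta} \big( (1 - \eta) M^d[u](y_x(\theta)) + \eta M^d[v](y_x(\theta)) \big) \Big\}$$

$$\ge (1 - \eta)\,\Gamma_d u(x) + \eta\,\Gamma_d v(x).$$

Thus, $\Gamma_d$ is a non-decreasing, concave function.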


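Suppose we are given $u_{\ell-1}$. We can define

$$u_\ell^d(x) := \Gamma^d u_{\ell-1}(x). \tag{3.98}$$

Since $u_1^d(x) = \Gamma^d u_0(x) \le u_0^d(x)$, then by the non-decreasing property of $\Gamma^d$, we have $u_2^d(x) = \Gamma^d u_1(x) \le \Gamma^d u_0(x) = u_1^d(x)$, and so

$$0 \le u_\ell^d(x) \le u_{\ell-1}^d(x) \le \dots \le u_0^d(x) \le \frac{\|f\|}{\beta}. \tag{3.99}$$

Thus, $u_\ell^d(x)$ converges. We can define

$$u_\infty^d(x) := \lim_{\ell \to \infty} u_\ell^d(x). \tag{3.100}$$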
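The monotone iteration $u_\ell = \Gamma u_{\ell-1}$ can be illustrated on a toy problem. The following is only a sketch under simplifying assumptions that are not in the text: time is discrete, the state space is finite, the dynamics are deterministic, the switching cost `k` is a single constant, and `discount` in (0, 1) plays the role of $e^{-\beta}$. The names `switching_value_iteration`, `f`, `step` and `discount` are hypothetical.

```python
def switching_value_iteration(f, step, k, discount, n_rounds, inner_iters=500):
    """Discrete-time analog of u_l = Gamma u_{l-1} (illustrative assumption).

    f[d][x]    : running cost in mode d at state x (non-negative)
    step[d][x] : next state when mode d is applied at state x
    k          : constant switching cost between any two distinct modes
    discount   : per-step discount factor in (0, 1)
    Returns [u_0, u_1, ..., u_{n_rounds}], each indexed as u[d][x].
    """
    m, n = len(f), len(f[0])
    # u_0: cost of never switching, the fixed point of u = f + discount * u(step).
    u = [[0.0] * n for _ in range(m)]
    for _ in range(inner_iters):
        u = [[f[d][x] + discount * u[d][step[d][x]] for x in range(n)]
             for d in range(m)]
    history = [u]
    for _ in range(n_rounds):
        prev = history[-1]
        # Obstacle M^d[prev](x): best value after paying k to switch to another mode.
        obstacle = [[min(prev[e][x] + k for e in range(m) if e != d)
                     for x in range(n)] for d in range(m)]
        # Optimal stopping against the obstacle: w = min(obstacle, f + discount * w(step)).
        w = [row[:] for row in obstacle]
        for _ in range(inner_iters):
            w = [[min(obstacle[d][x], f[d][x] + discount * w[d][step[d][x]])
                  for x in range(n)] for d in range(m)]
        history.append(w)
    return history
```

In this toy setting the entries of successive iterates decrease and stay non-negative, mirroring (3.99); entry `history[l]` corresponds to controls with at most `l` switches.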


Theorem 2.3.

$$u_\ell^d(x) = \inf\{J_x^d(\alpha) \mid \alpha \in A^d \text{ has at most } \ell \text{ switches}\} \tag{3.101}$$

and thus

$$u_\infty^d(x) = u^d(x) := \inf_{\alpha \in A^d} J_x^d(\alpha).$$


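Lemma 2.4. For each $0 < \gamma < \min\{1, \beta / (L(1 + \lambda_{\max}))\}$,

$$|u_\ell^d(x) - u_\ell^d(\tilde x)| \le C_\gamma |x - \tilde x|^\gamma \tag{3.102}$$

for all $1 \le \ell \le \infty$ and $x, \tilde x \in \mathbb{R}^n$ with

$$C_\gamma = \frac{[\text{illegible}]}{\beta - \gamma L(1 + \lambda_{\max})} \tag{3.103}$$

where

$$\lambda_{\max} = \max\{\lambda_1, \dots, \lambda_m\}. \tag{3.104}$$

If $\beta > L(1 + \lambda_{\max})$, then $\gamma$ can be taken to be 1.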

Remark. Since $N_i$ has independent increments, $F_s$ is independent of any sub $\sigma$-field generated by $N_i(t) - N_i(s)$, $s < t$, $i = 1, \dots, m$, so that for $t > s$,

$$E[\,|y_x(t) - y_{\tilde x}(t)| \mid F_s\,] \le |y_x(s) - y_{\tilde x}(s)|\,e^{L(1 + \lambda_{\max})(t - s)}.$$

Thus,

$$|u^d(y_x(s)) - u^d(y_{\tilde x}(s))| \le C_\gamma |y_x(s) - y_{\tilde x}(s)|^\gamma \quad \text{a.s.}$$

Remark. If $k_0 \ge \|f\| / \beta$, then $u_0(x)$ is the optimal solution, i.e., no switching occurs.

We can obtain the following estimate by the method in [18] [19].

Theorem 2.5. If $0 < k_0 < \|f\| / \beta$, then

$$[\text{illegible}] \tag{3.105}$$

Thus, $u_\ell \downarrow u_\infty$ uniformly.
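
Theorem 2.6.

$$u_\infty^d(x) = \inf_\theta E\Big\{\int_0^\theta f(y_x(s), d)\,e^{-\beta s}\,ds + \cdots\Big\} \tag{3.106}$$

Theorem 2.7. If $\exists x_0$ such that $u^d(x_0) < M^d[u](x_0)$, then $\theta_1 > 0$ a.s. and

$$u^d(x_0) = E\Big\{\int_0^\theta f(y_{x_0}(s), d)\,e^{-\beta s}\,ds + u^d(y_{x_0}(\theta))\,e^{-\beta\theta}\Big\} \tag{3.107}$$

for all $0 \le \theta \le \theta_1$.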

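Now, suppose we have a Hölder continuous function $u^d$ satisfying (3.21). We can define an optimal policy $\alpha^* = \{\theta_i, d_i\} \in A^d$ as follows.

$$\theta_0 = 0, \quad d_0 = d.$$

If we are given $\theta_{i-1}$, $d_{i-1}$, then set

$$\theta_i := \inf\{\text{stopping time } \theta \ge \theta_{i-1} \mid u^{d_{i-1}}(y_x(\theta)) = M^{d_{i-1}}[u](y_x(\theta)) \text{ a.s.}\} \tag{3.108}$$

If $\theta_i < \infty$, set

$d_i$ = any $F_{\theta_i}$-measurable random variable $\tilde d \in \{1, \dots, m\}$, $\tilde d \ne d_{i-1}$

such that

$$M^{d_{i-1}}[u](y_x(\theta_i)) = u^{\tilde d}(y_x(\theta_i)) + k(d_{i-1}, \tilde d) \quad \text{a.s.} \tag{3.109}$$

$y_x(t)$ is controlled by decision $d_{i-1}$ when $\theta_{i-1} \le t < \theta_i$.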
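The switching rule in (3.108) - (3.109) can be sketched for a tabulated value function. This is an illustration only, assuming a finite grid of states, a value table `u[d][x]` computed elsewhere, and a cost table `k[d][e]`; the function name `next_mode` and the tolerance `tol` are hypothetical.

```python
def next_mode(u, k, d, x, tol=1e-9):
    """Return the mode to use at grid state x when the current mode is d.

    u[e][x] : tabulated value function for mode e at state x (assumed given)
    k[d][e] : cost of switching from mode d to mode e
    The rule switches exactly when u^d(x) touches the obstacle M^d[u](x),
    and then picks a mode attaining the minimum in M^d[u](x).
    """
    candidates = [(u[e][x] + k[d][e], e) for e in range(len(u)) if e != d]
    obstacle, best = min(candidates)
    return best if u[d][x] >= obstacle - tol else d
```

The tolerance is a numerical convenience: on a computed table the equality $u^d = M^d[u]$ is only met approximately.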


Theorem 2.8. The control policy $\alpha^*$ defined by (3.108) and (3.109) is optimal, i.e., $u^d(x) = J_x^d(\alpha^*) = \min_{\alpha \in A^d} J_x^d(\alpha)$. In addition, $\theta_i \to \infty$ a.s. as $i \to \infty$.


Corollary 2.9. We have the additional estimate

$$\|u^d - u_\ell^d\| \le \frac{[\text{illegible}]}{\beta k_0 (\ell + 1)}. \tag{3.110}$$


3.2.3  Viscosity  Solutions  of  the  Quasi-Variational  Inequality  (QVI).

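We want to derive necessary and sufficient conditions for the optimal solution $u^d(x)$, $x \in \mathbb{R}^n$, $d \in \{1, \dots, m\}$. Assume for the moment that the value functions belong to $C^1(\mathbb{R}^n)$. Then by the necessary condition in Lemma 2.2, we have

$$E\Big\{\frac{u^d(x) - u^d(y_x(t))}{t}\Big\} \le E\Big\{\frac{1}{t}\int_0^t f(y_x(s), d)\,e^{-\beta s}\,ds + \frac{e^{-\beta t} - 1}{t}\,u^d(y_x(t))\Big\} \tag{3.111}$$

and so, we obtain a differential form as $t \downarrow 0$,

$$-g(x, d) \cdot \nabla u^d(x) - \lambda_d[u^d(x + h(x, d)) - u^d(x)] \le f(x, d) - \beta u^d(x) \tag{3.112}$$

$\forall x \in \mathbb{R}^n$ and $d \in \{1, \dots, m\}$. Combining (3.93), (3.107) and (3.112), we obtain a quasi-variational inequality (QVI)

$$\max\{\beta u^d - g^d \cdot \nabla u^d - \lambda_d[u^d(\cdot + h^d) - u^d] - f^d,\; u^d - M^d[u]\} = 0 \tag{3.113}$$

on $\mathbb{R}^n$, where

$$f^d(\cdot) := f(\cdot, d), \quad g^d(\cdot) := g(\cdot, d), \quad h^d(\cdot) := h(\cdot, d). \tag{3.114}$$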

Note  that  (3.113)  is  a  fully  nonlinear  first  order  partial  differential  equation 
which  does  not  admit  a  differentiable  solution  in  general.  But,  we  can  treat 
it  using  the  method  of  viscosity  solutions,  which  was  introduced  by  M.  G. 
Crandall  and  P.  L.  Lions  [9],  and  which  was  used  for  deterministic  switching 
problems  by  I.  Capuzzo  Dolcetta  and  L.  C.  Evans  [12]. 

We denote by $BUC(\mathbb{R}^n)^m$ the space of bounded, uniformly continuous $\mathbb{R}^m$-valued functions on $\mathbb{R}^n$.

Definition. A function $u = (u^1, \dots, u^m) \in C(\mathbb{R}^n)^m$ is said to be a viscosity solution of the (QVI) if for each $d \in \{1, \dots, m\}$ and each $\phi \in C^1(\mathbb{R}^n)$:

(i) if $u^d - \phi$ attains a local maximum at $x_0 \in \mathbb{R}^n$, then

$$\max\{\beta u^d(x_0) - g^d(x_0) \cdot \nabla\phi(x_0) - \lambda_d[u^d(x_0 + h^d(x_0)) - u^d(x_0)] - f^d(x_0),\; u^d(x_0) - M^d[u](x_0)\} \le 0 \tag{3.115}$$

(ii) if $u^d - \phi$ attains a local minimum at $x_0 \in \mathbb{R}^n$, then

$$\max\{\beta u^d(x_0) - g^d(x_0) \cdot \nabla\phi(x_0) - \lambda_d[u^d(x_0 + h^d(x_0)) - u^d(x_0)] - f^d(x_0),\; u^d(x_0) - M^d[u](x_0)\} \ge 0. \tag{3.116}$$

Theorem 3.1. Under the previous assumptions, the value function $u = (u^1, \dots, u^m)$ with

$$u^d(x) := \inf_{\alpha \in A^d} J_x^d(\alpha)$$

is a viscosity solution of the (QVI) (3.113).

Before discussing the existence of a solution to the (QVI), we consider the conditions under which (3.113) admits a unique solution, so that any function constructed to satisfy (3.113) must be the optimal solution.

Lemma 3.2. If $u = (u^1, \dots, u^m)$ is any viscosity solution of (3.113), then

$$u^d(x) \le M^d[u](x), \quad \forall x \in \mathbb{R}^n,\; d \in \{1, \dots, m\}. \tag{3.117}$$

Theorem 3.3. If $u = (u^1, \dots, u^m)$ and $v = (v^1, \dots, v^m)$ are viscosity solutions of (3.113), then $u = v$.

3.2.4  Existence  of  Viscosity  Solutions. 

Now,  we  use  a  finite  difference  approximation  to  construct  a  sequence  of 
solutions  which  converges  to  the  solution  of  (3.113). 

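Let $\rho \in C^2(\mathbb{R})$ be such that

$$\rho(x) = 0,\; x \le 0; \qquad \rho(x) > 0,\; x > 0; \qquad 0 < \rho'(x) \le 1,\; \rho''(x) \ge 0 \text{ for } x > 0 \tag{3.118}$$

and $\rho_\epsilon(x) = \rho(x / \epsilon)$, $x \in \mathbb{R}$, $\epsilon > 0$.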
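One concrete function with the properties listed in (3.118) may help fix ideas. The choice below is an illustration, not the function used in the report: it assumes the conditions read $\rho = 0$ for $x \le 0$, $\rho > 0$, $0 < \rho' \le 1$ and $\rho'' \ge 0$ for $x > 0$, and takes $\rho'(x) = 1 - e^{-x^2}$ for $x > 0$, which integrates to $\rho(x) = x - \tfrac{\sqrt{\pi}}{2}\,\mathrm{erf}(x)$. The names `rho` and `rho_eps` are hypothetical.

```python
import math


def rho(x):
    """Example penalty: zero for x <= 0, convex and increasing for x > 0.

    For x > 0, rho'(x) = 1 - exp(-x**2) lies in (0, 1) and
    rho''(x) = 2 * x * exp(-x**2) is non-negative; both vanish at x = 0,
    so the function is twice continuously differentiable on the real line.
    """
    if x <= 0.0:
        return 0.0
    return x - 0.5 * math.sqrt(math.pi) * math.erf(x)


def rho_eps(x, eps):
    """Scaled penalty rho_eps(x) = rho(x / eps) for eps > 0."""
    return rho(x / eps)
```

As `eps` decreases, `rho_eps` penalizes any positive excess of $u^d - u^{\tilde d} - k(d, \tilde d)$ more strongly, which is the mechanism the penalized system relies on.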

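Consider the penalized system for approximation.

$$\beta u_\epsilon^d(x) - \frac{1}{\epsilon}[u_\epsilon^d(x + \epsilon g^d(x)) - u_\epsilon^d(x)] - \lambda_d[u_\epsilon^d(x + h^d(x)) - u_\epsilon^d(x)] + \sum_{\tilde d \ne d} \rho_\epsilon(u_\epsilon^d(x) - u_\epsilon^{\tilde d}(x) - k(d, \tilde d)) = f^d(x) \tag{3.119}$$

or

$$u_\epsilon^d(x) - \frac{1}{\beta\epsilon}[u_\epsilon^d(x + \epsilon g^d(x)) - u_\epsilon^d(x)] - \frac{\lambda_d}{\beta}[u_\epsilon^d(x + h^d(x)) - u_\epsilon^d(x)] + \frac{1}{\beta}\sum_{\tilde d \ne d} \rho_\epsilon(u_\epsilon^d(x) - u_\epsilon^{\tilde d}(x) - k(d, \tilde d)) = \frac{1}{\beta} f^d(x). \tag{3.120}$$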

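We define operators $A, \Pi_1, \Pi_2 : C(\mathbb{R}^n)^m \to C(\mathbb{R}^n)^m$ such that $Au = (A^1 u, \dots, A^m u)$, $\Pi_1 u = (\Pi_1^1 u, \dots, \Pi_1^m u)$ and $\Pi_2 u = (\Pi_2^1 u, \dots, \Pi_2^m u)$ where

$$A^d u(x) := -\frac{1}{\beta\epsilon}[u^d(x + \epsilon g^d(x)) - u^d(x)] \tag{3.121}$$

$$\Pi_1^d u(x) := -\frac{\lambda_d}{\beta}[u^d(x + h^d(x)) - u^d(x)] \tag{3.122}$$

$$\Pi_2^d u(x) := \frac{1}{\beta}\sum_{\tilde d \ne d} \rho_\epsilon(u^d(x) - u^{\tilde d}(x) - k(d, \tilde d)). \tag{3.123}$$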

Definition. (i) An operator $S : X \to X$ with domain $D(S)$ is said to be accretive on the real Banach space $X$ if

$$\|x - \tilde x + \gamma[S(x) - S(\tilde x)]\| \ge \|x - \tilde x\| \tag{3.124}$$

for all $x, \tilde x \in D(S)$ and all $\gamma > 0$.

(ii) An operator $S$ is said to be m-accretive on $X$ if $S$ is accretive on $X$ and the range $R(I + \gamma S) = X$ for all $\gamma > 0$ (or equivalently for some $\gamma > 0$).

The following lemma is from Evans [17].

Perturbation Lemma 4.1. If $S$ is m-accretive on $X = C(\mathbb{R}^n)^m$ and $T$ is accretive, Lipschitz continuous and everywhere defined on $X$, then $(S + T)$ is m-accretive on $X$; in particular, the range $R(I + S + T) = C(\mathbb{R}^n)^m$.

Lemma 4.2. $A$ is m-accretive on $C(\mathbb{R}^n)^m$.

Lemma 4.3. $\Pi_1$ and $\Pi_2$ are accretive.

By the Perturbation Lemma 4.1, $A + \Pi_1 + \Pi_2$ is m-accretive and so, for each $\epsilon > 0$, we have a solution $u_\epsilon \in C(\mathbb{R}^n)^m$ of (3.120).

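Theorem 4.4.

(i) $0 \le u_\epsilon^d(x) \le \|f\| / \beta$, $x \in \mathbb{R}^n$, $\epsilon > 0$, $d \in \{1, \dots, m\}$.

(ii) For each $0 < \gamma < \min\{1, \beta / (L(1 + \lambda_{\max}))\}$,

$$|u_\epsilon^d(x) - u_\epsilon^d(\tilde x)| \le C_\gamma |x - \tilde x|^\gamma, \quad x, \tilde x \in \mathbb{R}^n,\; \epsilon > 0,\; d \in \{1, \dots, m\}$$

with the same constant $C_\gamma$ as in Lemma 2.4. If $\beta > L(1 + \lambda_{\max})$, we can take $\gamma = 1$.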


From the above lemma, the $u_\epsilon^d$ are uniformly bounded and uniformly Hölder continuous. Then by the Arzelà-Ascoli Theorem, there exists a subsequence $\epsilon_\ell$ such that $u_{\epsilon_\ell}^d \to u^d \in C(\mathbb{R}^n)$ for all $d \in \{1, \dots, m\}$. The convergence is uniform on each compact subset of $\mathbb{R}^n$. In fact, $u$ is bounded and Hölder continuous with the same Hölder exponent $\gamma$.

Theorem 4.5. $u_{\epsilon_\ell} \to u$ locally uniformly in $\mathbb{R}^n$ and $u$ solves (3.113) in the viscosity sense.

Remark. In general, $u$ is only Hölder continuous. If we know $u$ has some regularity properties, say $\nabla u$ exists in some neighborhood, then one can show $u$ satisfies (3.113) in the ordinary sense. The point is that the derivative of $u$ is not continuous across characteristic curves.

3.2.5  The  Case  of  Vanishing  Switching  Costs.

In the case when the switching costs vanish, $k(d, \tilde d) = 0$ for all $d \ne \tilde d$ in (3.19), the dynamics may be switched at any time without incurring a cost; hence, the minimum cost does not depend on the initial control. That is,

$$u^1 = u^2 = \dots = u^m := u. \tag{3.125}$$

If we follow the arguments used in the previous sections, we can show that $u$ is bounded and Hölder continuous with the same Hölder constant $C_\gamma$ used in Lemma 2.4. If $u$ were continuously differentiable on $\mathbb{R}^n$, then by the principle of dynamic programming, $u$ would be (formally) a solution of the Hamilton-Jacobi-Bellman equation

$$\max_{d \in \{1, \dots, m\}}\{\beta u - g^d \cdot \nabla u - \lambda_d[u(\cdot + h^d) - u] - f^d\} = 0 \tag{3.126}$$

on $\mathbb{R}^n$. However, $u$ is not always $C^1$. By invoking the same arguments used in subsection 4, we can show that $u$ is the unique viscosity solution of (3.126) in the following sense:

Definition 5.1. A bounded and continuous function $u$ on $\mathbb{R}^n$ is a viscosity solution of (3.126) if for each $\phi \in C^1(\mathbb{R}^n)$:

(i) if $u - \phi$ attains a local maximum at $x_0 \in \mathbb{R}^n$, then

$$\max_{d \in \{1, \dots, m\}}\{\beta u(x_0) - g^d(x_0) \cdot \nabla\phi(x_0) - \lambda_d[u(x_0 + h^d(x_0)) - u(x_0)] - f^d(x_0)\} \le 0 \tag{3.127}$$

(ii) if $u - \phi$ attains a local minimum at $x_0 \in \mathbb{R}^n$, then

$$\max_{d \in \{1, \dots, m\}}\{\beta u(x_0) - g^d(x_0) \cdot \nabla\phi(x_0) - \lambda_d[u(x_0 + h^d(x_0)) - u(x_0)] - f^d(x_0)\} \ge 0. \tag{3.128}$$


We now establish that the optimality system is closed; that is, each value function corresponding to non-zero switching costs will converge to $u$ as the switching costs tend to zero. The result corresponds to a similar result in Capuzzo Dolcetta - Evans [12].

Theorem 5.1. Suppose we have a set of switching costs $\{k_\epsilon(d, \tilde d)\}$ such that

$$k_\epsilon(d, \tilde d) > 0 \quad \forall d \ne \tilde d \in \{1, \dots, m\} \tag{3.129}$$

$$k_\epsilon(d, \hat d) < k_\epsilon(d, \tilde d) + k_\epsilon(\tilde d, \hat d), \quad d \ne \tilde d \ne \hat d.$$

For each $\epsilon > 0$ let $u_\epsilon = (u_\epsilon^1, \dots, u_\epsilon^m)$ be the unique viscosity solution of the corresponding QVI with switching costs $\{k_\epsilon(d, \tilde d)\}$ and let $u$ be the unique viscosity solution of (3.126). If $k_\epsilon(d, \tilde d) \to 0$ as $\epsilon \to 0$ for all $d, \tilde d \in \{1, \dots, m\}$, then $u_\epsilon^d \to u$ as $\epsilon \to 0$ for all $d \in \{1, \dots, m\}$.

Acknowledgment.  We  would  like  to  thank  Professor  L.  C.  Evans  for 
his  contributions  to  this  portion  of  our  work. 


3.3  Almost  Sure  Stability  of  Linear  Stochastic 
Systems  with  Poisson  Process  Coefficients 

In this section we consider the problem of determining the sample path stability of a class of linear stochastic differential equations with point process coefficients. Necessary and sufficient conditions are obtained which are similar in spirit to those derived by Khas’minskii and Pinsky for diffusion processes. The conditions are based on the deep theorems of Furstenberg on the asymptotic behavior of products of random matrices. Estimates on the probabilities of large deviations for stable processes are also given, together with a result on the stabilization of unstable systems by feedback controls.

3.3.1  The  Problem  and  Main  Results. 


Consider the linear stochastic system

$$dx(t) = Ax(t)\,dt + \sum_{i=1}^m B_i x(t)\,dN_i(t), \tag{3.130}$$

$$x(0) = x_0 \in \mathbb{R}^n \setminus \{0\}, \quad t \ge 0,$$

on the underlying probability space $(\Omega, F, P)$ with $A$ and $B_i$ constant $n \times n$ real matrices, and $\{N_i(t), t \ge 0\}$, $i = 1, \dots, m$, independent Poisson processes - specifically, one dimensional counting processes with intensity $\lambda_i > 0$ and right-continuous paths. $N_i(t) \in \{0, 1, 2, \dots\}$ counts the number of occurrences in $[0, t]$. We are interested in the almost sure stability properties of the solutions of (3.130). That is, if $|\cdot|$ is any norm on $\mathbb{R}^n$ ($\|\cdot\|$ is the induced matrix norm), we would like to characterize the asymptotic exponential growth rate

$$\lim_{t \to \infty} \frac{1}{t}\log\Big(\frac{|x(t)|}{|x_0|}\Big) \tag{3.131}$$

if it exists.

This problem is the analog of the one considered by Khas’minskii [27]
and  Pinsky  [28]  for  diffusion  processes,  and  by  Loparo  and  Blankenship  [29] 
for  systems  with  jump  process  coefficients.  Like  previous  results,  the  expres¬ 
sion  given  here  for  the  growth  rate  is  not  an  explicit,  readily  computable 
one,  except  in  simple  cases.  The  stability  properties  of  the  moments  of  the 
solution  of  (3.130)  were  considered  by  Marcus  [30]  [31]  (see  also  [32]).  Ex¬ 
plicit  stability  criteria  are  possible  for  the  moments.  Related  results  on  the 
optimal  control  and  scheduling  of  systems  with  Poisson  noises  are  given  in 
[23]  [24].  See  also  [26]. 

The system (3.130) is interpreted in terms of the integral equation

$$x(t) = x_0 + \int_0^t Ax(s)\,ds + \sum_{i=1}^m \int_0^t B_i x(s)\,dN_i(s) \tag{3.132}$$

with the stochastic integral defined by the calculus explained in [31] [33].* Let $\{\tau_j^i, j \ge 1\}$ be the interarrival times and $t_j^i = \tau_1^i + \dots + \tau_j^i$ be the occurrence times for the Poisson process $N_i(t)$. Then

$$\int_0^t B_i x(s)\,dN_i(s) = \begin{cases} 0, & N_i(t) = 0 \\ \sum_{j=1}^{N_i(t)} B_i x(t_j^i -), & N_i(t) \ge 1. \end{cases} \tag{3.133}$$

*We could also treat some of the more complicated point process models in [31] [33], but the main ideas are best conveyed by the simple case considered here.

Now, let $\{\tau_j, j \ge 1\}$ be the interarrival times of the sum process $N(t) = N_1(t) + \dots + N_m(t)$ with intensity $\lambda = \lambda_1 + \dots + \lambda_m$, and let $\mu_j$ be the process indicating which $N_i$ underwent an increment at the occurrence time $t_j = \tau_1 + \dots + \tau_j$. We assume the probability of multiple, simultaneous jumps is zero. The process $x(t)$, $t \ge 0$ exists, has right continuous paths, and jumps at $t_j$, $j = 1, 2, \dots$. If we set $D_i = I + B_i$, then

$$x(t) = e^{A(t - t_{N(t)})}\,D_{\mu_{N(t)}}\,e^{A\tau_{N(t)}} \cdots D_{\mu_1}\,e^{A\tau_1}\,x_0. \tag{3.134}$$

This  expression  is  the  basis  of  our  treatment  of  the  almost  sure  stability 
problem.  Its  composition  as  a  product  of  random  matrices  directed  our 
attention  to  the  work  of  Furstenberg  and  Kesten  [35],  Grenander  [36]  and 
Furstenberg  [38]- [41]  on  the  limits  of  products  of  random  matrices. 

Our main result is based on the following observations. First, for each $i = 1, \dots, m$, the $\{\tau_j^i, j \ge 1\}$ are independent and exponentially distributed with parameter $\lambda_i$. The random processes $\{\tau_j, \mu_j, j \ge 1\}$ depend in a complex way on $\{\tau_j^i, i = 1, \dots, m, j \ge 1\}$. However, $\{\tau_j, j \ge 1\}$ and $\{\mu_j, j \ge 1\}$ are independent and form independent, identically distributed sequences. This follows from the presumed independence of the $\{N_i(t), i = 1, \dots, m\}$; see [25]. As a consequence, we have the following:
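The structure of the merged process can be checked by simulation. The sketch below superposes independent Poisson streams and returns the merged interarrival times together with the index of the stream that jumped; by the standard superposition property, the gaps should look exponential with rate $\lambda = \sum_i \lambda_i$ and stream $i$ should be selected with frequency $\lambda_i / \lambda$. The function name `merge_poisson` and its arguments are hypothetical, and the finite horizon is a simulation convenience.

```python
import random


def merge_poisson(rates, horizon, rng):
    """Superpose independent Poisson processes on [0, horizon].

    rates   : list of intensities lambda_i > 0
    horizon : length of the simulated time window
    rng     : a random.Random instance
    Returns a list of (interarrival_time, stream_index) pairs for the
    merged process, in order of occurrence.
    """
    events = []
    for i, lam in enumerate(rates):
        t = rng.expovariate(lam)          # first occurrence of stream i
        while t <= horizon:
            events.append((t, i))
            t += rng.expovariate(lam)     # next interarrival of stream i
    events.sort()
    merged, previous = [], 0.0
    for t, i in events:
        merged.append((t - previous, i))  # tau_j and mu_j of the sum process
        previous = t
    return merged
```

With rates `[1.0, 3.0]`, for instance, the empirical mean gap should be close to 0.25 and roughly three quarters of the jumps should come from the second stream.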

Theorem (Stability). Consider the system (3.130) with the stated assumptions on the processes $N_i(t)$, $i = 1, \dots, m$. Then

$$r = \lim_{k \to \infty} \frac{1}{k} E\log\|D_{\mu_k} e^{A\tau_k} \cdots D_{\mu_1} e^{A\tau_1}\| < \infty \tag{3.135}$$

exists and

$$r = \lim_{k \to \infty} \frac{1}{k}\log\|D_{\mu_k} e^{A\tau_k} \cdots D_{\mu_1} e^{A\tau_1}\| \quad \text{a.s.} \tag{3.136}$$

The quantity $r$ is the asymptotic exponential growth rate of the process $x(t)$; that is,

$$[\text{illegible}]$$

Hence, $r > 0$ implies almost sure instability and $r < 0$ corresponds to almost sure asymptotic stability. This result is proved in section 3 of [25].

It is possible to obtain a more detailed description of the long term behavior of $\{x(t), t \ge 0\}$ by examining the behavior of products of random matrices acting on specific initial states $x(0) = x_0$. The key questions are: Does the limit of

$$\frac{1}{k}\log\|D_{\mu_k} e^{A\tau_k} \cdots D_{\mu_1} e^{A\tau_1} x_0\|$$

exist? If it does, how is it related to the rate $r$ in (3.136)? To treat these questions, we generalize some results of Furstenberg, Kesten, Grenander and others on random walks on semi-simple Lie groups to general semi-groups (not necessarily groups since the terms $D_k$ may be singular). This analysis is given in section 4 of [25]. The main result is as follows:

Suppose $\mu$ is the measure on the Borel sets $B(\mathbb{R}^{n \times n})$ defined by

$$\mu(\Gamma) = P\{D_{\mu_1} e^{A\tau_1} \in \Gamma\}, \quad \Gamma \in B(\mathbb{R}^{n \times n}).$$

Let SG be the closed semi-group generated by the support of $\mu$, i.e.

SG = smallest closed semi-group $\supset \{D_i e^{At},\; 0 \le t < \infty,\; i = 1, \dots, m\}$.

Let $\nu$ be an invariant measure for $\mu$; i.e., a solution of the integral equation

$$\mu * \nu = \nu. \tag{3.137}$$

Let $Q_0$ be the collection of extremal invariant probability measures of $\mu$ on $M = S^{n-1} \cup \{0\}$.

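Theorem. For all $\nu \in Q_0$,

$$r_\nu = \sum_{i=1}^m \lambda_i \int_M \int_0^\infty \log|D_i \exp(At)u|\,e^{-\lambda t}\,dt\,d\nu(u) < \infty \tag{3.138}$$

and

$$r_\nu = \lim_{k \to \infty} \frac{1}{k}\log|D_{\mu_k} e^{A\tau_k} \cdots D_{\mu_1} e^{A\tau_1} x_0| \quad \text{a.s.} \tag{3.139}$$

for all $x_0 \in E_\nu^0$, an ergodic component corresponding to $\nu \in Q_0$. Indeed, there are only finitely many different values, say, $r_1 < r_2 < \dots < r_\ell = r$, $\ell \le n$. Furthermore, if $\bigcup_\nu E_\nu^0$ contains a basis of $\mathbb{R}^n$, then the system (3.130) is asymptotically stable almost surely if $r_\ell < 0$, while the system (3.130) is asymptotically unstable if $r_1 > 0$. In case $r_1 < 0$ and $r_\ell > 0$, the stability of the system depends on the initial state $x_0$.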

To apply these theorems to a specific problem, one must determine $r$ or at least its sign; or, more generally, the collection $Q_0$ must be constructed and $r_\nu$ computed. If the semi-group SG is transitive or irreducible, then $r_\nu$ will be independent of $\nu$ (even though there may be many ergodic components). (See Theorem 4.10 and Corollary 4.11 of [25].) In this case a theorem of Furstenberg ([38], Theorem 8.6) may be used to determine the sign of $r_\nu = r$. Application of this result to specific systems requires a close analysis of the geometric structure of the semi-group associated with those systems. Several examples are given in the next subsection to illustrate the techniques.

Two final results of interest in engineering practice concern the occurrence of large deviations in the paths $\{x(t), t \ge 0\}$ of a stable system (3.130) and the ability to stabilize a system like (3.130) with feedback controls.


The following result is proved in section 5 of [25].

Theorem (Large deviations). If the system (3.130) is asymptotically stable with $r_\nu < 0$, then there exist constants $M(x_0, R)$ and $r_\nu \lambda < \gamma < 0$ such that

$$P\{\sup_{s \ge t}|x(s)| > R\} \le M(x_0, R)\,e^{\gamma t}, \quad t \ge 0. \tag{3.140}$$

The constants may be determined rather precisely; see [25] for details.

Theorem (Stabilization). The control system with state and control dependent Poisson noises

$$dx(t) = Ax(t)\,dt + Bu(t)\,dt + Cx(t)\,dN_1(t) + Du(t)\,dN_2(t) \tag{3.141}$$

is stabilized almost surely by the linear feedback control $u(t) = -Kx(t)$, where $K$ is any matrix such that

$$\lambda_1 \int_0^\infty \log\|(I + C)e^{(A - BK)t}\|\,e^{-\lambda t}\,dt + \lambda_2 \int_0^\infty \log\|(I - DK)e^{(A - BK)t}\|\,e^{-\lambda t}\,dt < 0 \tag{3.142}$$

where $\lambda_i$ is the intensity of $N_i(t)$ and $\lambda$ is the intensity of $N(t) = N_1(t) + N_2(t)$. If $D = 0$ (no control dependent noise) and $(A, B)$ is controllable, i.e.,

$$\operatorname{rank}[B, AB, \dots, A^{n-1}B] = n,$$

then (3.141) is stabilized by any matrix $K$ for which the eigenvalues of $A - BK$ lie to the left of $\operatorname{Re}(s) = -\lambda_1 \log\|I + C\|$ in the complex plane.
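For a scalar system the integrals in (3.142) can be evaluated in closed form, which gives a quick feel for the condition. The sketch below assumes $n = 1$ with scalar coefficients `a`, `b`, `c`, `d` and gain `K` (all hypothetical names), and uses $\int_0^\infty (\log|g| + st)\,e^{-\lambda t}\,dt = \log|g| / \lambda + s / \lambda^2$. It also assumes $1 + c \ne 0$ and $1 - dK \ne 0$ so that the logarithms are finite.

```python
import math


def scalar_feedback_margin(a, b, c, d, K, lam1, lam2):
    """Left-hand side of (3.142) for a scalar system (illustrative case).

    In one dimension log|g * exp(s t)| = log|g| + s t, so each integral
    reduces to log|g| / lam + s / lam**2 with s = a - b K and
    lam = lam1 + lam2. A negative return value means the condition holds.
    """
    lam = lam1 + lam2
    s = a - b * K                      # closed-loop drift
    term1 = lam1 * (math.log(abs(1.0 + c)) / lam + s / lam ** 2)
    term2 = lam2 * (math.log(abs(1.0 - d * K)) / lam + s / lam ** 2)
    return term1 + term2
```

In this scalar case the condition is equivalent to $\lambda_1 \log|1 + c| + \lambda_2 \log|1 - dK| + (a - bK) < 0$, which shows how the gain trades off drift against jump amplification.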


3.3.2  Examples  and  Applications. 

We  would  like  to  use  some  examples  to  show  how  to  apply  our  theorems  to 
determine  stability  properties  of  specific  systems.  As  we  shall  see,  in  many 
cases,  it  is  difficult  to  find  the  necessary  invariant  measure  because  it  is 
associated  with  an  integral  equation  with  shift  arguments.  It  is  difficult  to 
evaluate  a  solution  from  this  equation,  although  it  exists. 

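Example 2.1. Consider the simple system

$$dx(t) = \begin{pmatrix} k & \omega \\ -\omega & k \end{pmatrix} x(t)\,dt + \begin{pmatrix} a - 1 & 0 \\ 0 & a - 1 \end{pmatrix} x(t)\,dN(t) \tag{3.143}$$

where $N(t)$ is a Poisson process with intensity $\lambda > 0$. Then

$$e^{At} = e^{kt}\begin{pmatrix} \cos\omega t & \sin\omega t \\ -\sin\omega t & \cos\omega t \end{pmatrix}, \quad \omega > 0,$$

and the density

$$f(\theta) = \frac{1}{2\pi}, \quad 0 \le \theta < 2\pi$$

satisfies (3.146). Since SG is transitive on $S$, the Haar measure $\nu(\theta)$ with density $f(\theta)$ is the unique invariant measure of $\mu$. Thus,

$$r_\nu = \int_{SG \times S} \log|g \circ x|\,d\mu(g)\,d\nu(x) = \int_0^\infty \log|a e^{kt}|\,\lambda e^{-\lambda t}\,dt = \log|a| + \frac{k}{\lambda}.$$

Consequently, if $k < -\lambda\log|a|$, the system (3.143) is asymptotically stable, while for $k > -\lambda\log|a|$, the system (3.143) is asymptotically unstable.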
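The closed form in Example 2.1 is easy to confirm numerically. The sketch below takes the integrand $\log|a e^{kt}|$ from the display above as given and averages it over exponential waiting times with rate $\lambda$; the function names are hypothetical, and the Monte Carlo estimate is only a sanity check on the integral, not a simulation of the full system.

```python
import math
import random


def rate_closed_form(a, k, lam):
    """Growth rate log|a| + k / lam from Example 2.1."""
    return math.log(abs(a)) + k / lam


def rate_monte_carlo(a, k, lam, n_samples, rng):
    """Average of log|a * exp(k t)| over t ~ Exponential(lam)."""
    total = 0.0
    for _ in range(n_samples):
        t = rng.expovariate(lam)
        total += math.log(abs(a)) + k * t
    return total / n_samples


def is_stable(a, k, lam):
    """Stability test k < -lam * log|a|, equivalent to a negative rate."""
    return k < -lam * math.log(abs(a))
```

For example, with `a = 0.5` and `lam = 2.0` the threshold is $k = 2\log 2 \approx 1.386$: values of `k` below it give a negative rate.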
Example  2.2  (Harmonic  oscillator  with  damping). 

Let  y(t)  be  a  point  process,  regarded  as  the  formal  derivative  of  a  Poisson 
process  N(t)  with  intensity  A.  Consider  the  second  order  system 


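$$\ddot z(t) + y(t)\dot z(t) + \omega^2 z(t) + k\,y(t) z(t) = 0 \tag{3.147}$$

$z(0)$, $\dot z(0)$ given, $t \ge 0$, $\omega > 0$, $k > 0$.

Let $x_1(t) = \omega z(t)$, $x_2(t) = \dot z(t)$ and $x(t) = [x_1(t), x_2(t)]^T$. Then

$$dx(t) = \begin{pmatrix} 0 & \omega \\ -\omega & 0 \end{pmatrix} x(t)\,dt + \begin{pmatrix} 0 & 0 \\ -(k/\omega) & -1 \end{pmatrix} x(t)\,dN(t), \tag{3.148}$$

$$x(0) = \begin{pmatrix} \omega z(0) \\ \dot z(0) \end{pmatrix} \text{ given}.$$

Set

$$D = I + B = \begin{pmatrix} 1 & 0 \\ -(k/\omega) & 0 \end{pmatrix}, \qquad e^{At} = \begin{pmatrix} \cos\omega t & \sin\omega t \\ -\sin\omega t & \cos\omega t \end{pmatrix}.$$

Let SG be the smallest closed semi-group containing $De^{At}$, $t \ge 0$. The probability measure $\mu$ on SG has density $\lambda e^{-\lambda t}$, $t \ge 0$ at each element $De^{At}$. Since $D$ is singular, we take $M = S^1 \cup \{0\}$. It is easy to see that the only invariant set is

$$E = \Big\{p_1 = \Big(\frac{\omega}{\sqrt{\omega^2 + k^2}}, \frac{-k}{\sqrt{\omega^2 + k^2}}\Big),\; p_2 = \Big(\frac{-\omega}{\sqrt{\omega^2 + k^2}}, \frac{k}{\sqrt{\omega^2 + k^2}}\Big),\; 0\Big\}$$

with invariant measure $\nu$ of $\mu$ being defined by

$$\nu(p_1) = \nu(p_2) = \tfrac{1}{2} \quad \text{and} \quad \nu(0) = 0.$$

Note that $SG \circ S^1 = E$ is invariant, so that the stability of the transient set $F = S^1 \setminus E$ also depends on $r_\nu$ though $E$ does not span $\mathbb{R}^2$. (See [25].) Now, we calculate $r_\nu = r$ as follows.

$$r_\nu = \int_{SG \times M} \log|gx|\,d\mu(g)\,d\nu(x) = \frac{1}{2}\sum_{i=1}^2 \int_0^\infty \log|De^{At}p_i|\,\lambda e^{-\lambda t}\,dt$$

$$= \int_0^\infty \log\Big|\cos\omega t - \frac{k}{\omega}\sin\omega t\Big|\,\lambda e^{-\lambda t}\,dt$$

$$= \frac{1}{2}\int_0^\infty \log\Big[\cos^2\omega t - \frac{2k}{\omega}\cos\omega t\sin\omega t + \frac{k^2}{\omega^2}\sin^2\omega t\Big]\lambda e^{-\lambda t}\,dt$$

$$= \frac{1}{2}\int_0^\infty \log\Big[\frac{1}{2}\Big(1 + \frac{k^2}{\omega^2}\Big)\big(1 + \cos(2\omega t + \alpha)\big)\Big]\lambda e^{-\lambda t}\,dt$$

$$= \frac{1}{2}\log\Big[\frac{1}{2}\Big(1 + \frac{k^2}{\omega^2}\Big)\Big] + \frac{1}{2}\int_0^\infty \log[1 + \cos(2\omega t + \alpha)]\,\lambda e^{-\lambda t}\,dt,$$

where $\tan\alpha = 2k\omega / (\omega^2 - k^2)$.

If

$$[\text{condition illegible}]$$

then $r_\nu > 0$ and the system (3.148) is asymptotically unstable; while for

$$[\text{condition illegible}] \tag{3.155}$$

we have $r_\nu < 0$ and the system (3.148) becomes asymptotically stable.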
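Since the stability thresholds of Example 2.2 depend on an integral with no simple closed form, a numerical evaluation is a natural check. The sketch below assumes the rate is $r_\nu = \int_0^\infty \log|\cos\omega t - (k/\omega)\sin\omega t|\,\lambda e^{-\lambda t}\,dt$, as obtained from $|De^{At}p_1|$, and uses a plain midpoint rule on a truncated interval; the truncation length, step count and function names are assumptions made for illustration. A second helper evaluates the same integrand in its phase form $\tfrac12\log[\tfrac12(1 + k^2/\omega^2)(1 + \cos(2\omega t + \alpha))]$ so the two expressions can be compared pointwise.

```python
import math


def log_gain(t, k, w):
    """log |cos(w t) - (k / w) sin(w t)|, the one-jump log growth factor."""
    return math.log(abs(math.cos(w * t) - (k / w) * math.sin(w * t)))


def log_gain_phase_form(t, k, w):
    """Same quantity written with a single phase-shifted cosine."""
    q = 1.0 + (k / w) ** 2
    alpha = math.atan2(2.0 * k / w, 1.0 - (k / w) ** 2)
    return 0.5 * math.log(0.5 * q * (1.0 + math.cos(2.0 * w * t + alpha)))


def growth_rate(k, w, lam, n_steps=100000, tail=40.0):
    """Midpoint-rule estimate of the rate over [0, tail / lam].

    The integrand has integrable logarithmic singularities; the floor
    1e-300 only guards against taking log(0) at an unlucky grid point.
    """
    h = (tail / lam) / n_steps
    total = 0.0
    for j in range(n_steps):
        t = (j + 0.5) * h
        gain = abs(math.cos(w * t) - (k / w) * math.sin(w * t))
        total += math.log(max(gain, 1e-300)) * lam * math.exp(-lam * t) * h
    return total
```

With `k = 0` the integrand is never positive, so the estimate is negative; for large `k / w` the leading term $\tfrac12\log(1 + k^2/\omega^2)$ dominates and the estimate becomes positive.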

Example 2.3 (Randomly coupled harmonic oscillators) (cf. [47] for $m = 1$). Let $y_{ij}(t)$, $i, j = 1, \dots, m$, be independent processes which are regarded as formal derivatives of independent Poisson processes $N_{ij}(t)$ with intensities $\lambda_{ij}$, respectively. Consider the following stochastic system of $m$ coupled harmonic oscillators.

$$\ddot z_i(t) + \omega_i^2 z_i(t) = \sum_{j=1}^m b_{ij}\,y_{ij}(t)\,z_j(t) \tag{3.156}$$

$z_i(0)$, $\dot z_i(0)$ given, $t \ge 0$, $\omega_i > 0$, $i = 1, \dots, m$.

Let $x_{2i-1}(t) = \omega_i z_i(t)$, $x_{2i}(t) = \dot z_i(t)$ and $x = [x_1, \dots, x_{2m}]^T$. Then in standard notation

$$dx(t) = Ax(t)\,dt + \sum_{i,j=1}^m B_{ij} x(t)\,dN_{ij}(t) \tag{3.157}$$

where

$$A = \operatorname{diag}\{A_1, \dots, A_m\}, \qquad A_i = \begin{pmatrix} 0 & \omega_i \\ -\omega_i & 0 \end{pmatrix},$$

and all the entries of $B_{ij}$ are zero except the entry $e_{2i, 2j-1} = b_{ij} / \omega_j$. Set

$$D_{ij} = I + B_{ij}.$$

Note that $\operatorname{tr}(A) = 0$ and $\det(D_{ij}) = 1$, so we have $D_{ij}\exp(At) \in SL(2m)$. We can define a measure $\mu$ on $SL(2m)$ with density $\lambda_{ij} e^{-\lambda t}$, $t \ge 0$, $\lambda = \sum_{i,j} \lambda_{ij}$, at each element $D_{ij} e^{At}$. In this case, it is difficult to determine an invariant measure because the corresponding integral equation is hard to solve. However, we can use a theorem of Furstenberg (Theorem 4.12 in [25]) to show the rate $r > 0$. Let

$G$ = smallest subgroup containing $D_{ij} e^{At}$, $0 \le t < \infty$, $i, j = 1, \dots, m$
  = smallest subgroup containing $D_{ij}$, $i, j = 1, \dots, m$; $e^{At}$, $0 \le t < \infty$.


Then $G$ may not be transitive on $S^{2m-1}$. If we assume no two $\omega_i$ are equal, then the commutant $E$ of the smallest subgroup $G_1$ containing $e^{At}$, $t \ge 0$ is isomorphic to $\mathbb{C}^m$, i.e., $T \in E$ if

$$T = \operatorname{diag}\{T_1, \dots, T_m\}.$$

Since $Te^{At} = e^{At}T$, and $T$ and $e^{At}$ are normal, they preserve their eigenspaces. Thus, the invariant subspaces $V$ of $G_1$ are of the form $\mathbb{R}^2_{j_1} \times \dots \times \mathbb{R}^2_{j_\ell}$, $\ell \le m$.

Before verifying the hypotheses of Furstenberg’s theorem, we need a non-degeneracy assumption:

(A) For any index set $J = \{j_1, \dots, j_\ell\}$, $\ell < m$, there exists an $i \notin J$ such that $b_{ik} \ne 0$ for some $k \in J$.

By assumption (A), $\exists\, b_{ik} \ne 0$ so that the entry $e_{2i, 2k-1}(D_{ik}^j) = j\,b_{ik} / \omega_k$ tends to infinity as $j \to \infty$. Thus, $G$ is not compact.

Let $J = \{j_1, \dots, j_\ell\}$ be an index set and $V$ the corresponding invariant subspace of $G_1$. By assumption (A), $\exists\, i \notin J$ such that $b_{ik} \ne 0$ for some $k \in J$. Then $D_{ik}V \not\subset V$. Hence, $G$ is irreducible.

Note that $G_1$ is connected. There is no proper finite index subgroup of $G_1$. Thus, any finite index subgroup $H$ of $G$ must contain $G_1$ and some mixed powers of $\{D_{ij}\}$. Moreover, the irreducibility of $G$ is due to there being sufficiently many non-zero entries of $D_{ij}$, not to the exact values $b_{ij}$, so $H$ is also irreducible.

In the cases where some $\omega_i$ are equal, the commutant $E$ properly contains $\mathbb{C}^m$ and the invariant subspaces of $G_1$ are much more complicated.

Consequently, by Furstenberg’s Theorem ([38], Theorem 8.6), $r_\nu = r > 0$ and $x(t)$ grows exponentially a.s. This implies that all the states of all subsystems grow exponentially.

Remark. If assumption (A) does not hold, the system can be subdivided into proper subsystems $\Sigma_i$, which have property (A), and $\Sigma$. States of $\Sigma_i$ grow exponentially a.s. by the above arguments. The remaining subsystem $\Sigma$ depends on $\Sigma_i$ and its state thus grows exponentially a.s. Hence, the system of $m$ coupled harmonic oscillators is asymptotically unstable.

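Example 2.4 (Random telegraph wave).

Let $z(t)$ be a random telegraph wave which takes values in the set $Z = \{-1, 1\}$ with transition probability satisfying

$$[\text{illegible}]$$

Then the differential equation for $z(t)$ becomes

$$dz(t) = -2z(t)\,dN(t) \tag{3.158}$$

$$z(0) = \pm 1$$

where $N(t)$ is a Poisson process with intensity $\lambda$. If we consider the state process

$$dx(t) = [k + \omega z(t)]x(t)\,dt \tag{3.159}$$

$$x(0) = x_0, \quad \omega > 0, \quad t \ge 0,$$

then using (3.158), (3.159) and the fact $z^2(t) = 1$, we get

$$d(zx) = dz\,x + z\,dx = -2zx\,dN + z(k + \omega z)x\,dt = \omega x\,dt + kzx\,dt - 2zx\,dN. \tag{3.160}$$

Combining (3.159) and (3.160), we have

$$d\begin{pmatrix} x \\ zx \end{pmatrix} = \begin{pmatrix} k & \omega \\ \omega & k \end{pmatrix}\begin{pmatrix} x \\ zx \end{pmatrix}dt + \begin{pmatrix} 0 & 0 \\ 0 & -2 \end{pmatrix}\begin{pmatrix} x \\ zx \end{pmatrix}dN. \tag{3.161}$$

Then,

$$\exp At = e^{kt}\begin{pmatrix} \cosh\omega t & \sinh\omega t \\ \sinh\omega t & \cosh\omega t \end{pmatrix}, \qquad D = I + B = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$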

Let SG be the smallest closed semi-group containing $De^{At}$, $0 \le t < \infty$, and let the measure $\mu$ be defined on SG with density $\lambda e^{-\lambda t}$, $t \ge 0$ at each element $De^{At}$. The corresponding invariant measure $\nu$ is difficult to calculate exactly and may not be unique since SG is not transitive on the circle $S^1$. However, SG is irreducible. By Furstenberg’s Theorem, the rate $r$ is independent of $\nu$.

Let

$$X(t) = De^{At} = e^{kt}\begin{pmatrix} \cosh\omega t & \sinh\omega t \\ -\sinh\omega t & -\cosh\omega t \end{pmatrix},$$

then

$$\|X(t)\|_2 = e^{kt}(\cosh 2\omega t + \sinh 2\omega t)^{1/2} = e^{(k + \omega)t}$$

and

$$r_1 = \int_0^\infty \log\|X(t)\|_2\,\lambda e^{-\lambda t}\,dt = \int_0^\infty (k + \omega)t\,\lambda e^{-\lambda t}\,dt = \frac{k + \omega}{\lambda}.$$


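Again, we calculate

$$X(t_2)X(t_1) = e^{k(t_1 + t_2)}\begin{pmatrix} \cosh\omega(t_1 - t_2) & \sinh\omega(t_1 - t_2) \\ \sinh\omega(t_1 - t_2) & \cosh\omega(t_1 - t_2) \end{pmatrix}$$

with

$$\|X(t_2)X(t_1)\|_2 = e^{k(t_1 + t_2)}[\cosh\omega(t_1 - t_2) + \sinh\omega|t_1 - t_2|] = e^{k(t_1 + t_2)}e^{\omega|t_1 - t_2|},$$

so that

$$r_2 = \int_0^\infty\!\!\int_0^\infty \log\|X(t_2)X(t_1)\|_2\,\lambda e^{-\lambda t_1}\,dt_1\,\lambda e^{-\lambda t_2}\,dt_2 = \int_0^\infty\!\!\int_0^\infty [k(t_1 + t_2) + \omega|t_1 - t_2|]\,\lambda e^{-\lambda t_1}\,dt_1\,\lambda e^{-\lambda t_2}\,dt_2 = \frac{2k + \omega}{\lambda}.$$

In general, with $s_\ell = t_1 - t_2 + \dots + (-1)^{\ell - 1}t_\ell$,

$$r_\ell = \int_0^\infty \cdots \int_0^\infty \log\|X(t_\ell) \cdots X(t_1)\|_2\,\lambda e^{-\lambda t_1}\,dt_1 \cdots \lambda e^{-\lambda t_\ell}\,dt_\ell = \frac{\ell k}{\lambda} + \omega E|s_\ell|, \qquad 0 \le E|s_\ell| \le \frac{\sqrt{\ell + 1}}{\lambda}.$$

Thus,

$$r = \lim_{\ell \to \infty} \frac{r_\ell}{\ell} = \frac{k}{\lambda}.$$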
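The limit $r = k / \lambda$ can also be checked by simulating the product of random matrices directly. The sketch below applies $X(t) = De^{At}$ with $D = \operatorname{diag}(1, -1)$ and the hyperbolic form of $e^{At}$ to a vector, renormalizing after every jump and accumulating the log of the growth factor; this estimates the growth rate along one sample path rather than the expectations $r_\ell$. The function name, the starting vector and the sample size are assumptions for illustration.

```python
import math
import random


def telegraph_rate(k, w, lam, n_jumps, rng):
    """Estimate (1 / n) log |X(t_n) ... X(t_1) v| for the telegraph example.

    Each factor is X(t) = D exp(A t) with exp(A t) = e^{k t} [[cosh, sinh],
    [sinh, cosh]] and D = diag(1, -1); t is exponential with rate lam.
    The vector is renormalized at every step to avoid overflow.
    """
    v0, v1 = 1.0, 0.3                     # arbitrary non-zero start vector
    log_growth = 0.0
    for _ in range(n_jumps):
        t = rng.expovariate(lam)
        c, s = math.cosh(w * t), math.sinh(w * t)
        y0 = c * v0 + s * v1              # hyperbolic part of exp(A t) v
        y1 = -(s * v0 + c * v1)           # D flips the sign of the second entry
        norm = math.hypot(y0, y1)
        log_growth += k * t + math.log(norm)   # e^{k t} tracked in the log only
        v0, v1 = y0 / norm, y1 / norm
    return log_growth / n_jumps
```

For a long product the estimate should be close to `k / lam`, with the `w`-dependent contribution decaying like the inverse square root of the number of jumps.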


From (3.161), we know that stability of (3.159) is equivalent to that of (3.161). Hence, the system (3.159) is asymptotically stable for $k < 0$ while it is asymptotically unstable for $k > 0$. This result shows that the random telegraph process $z(t)$ does not affect the stability of the corresponding deterministic system.

Chapter  4

Simultaneous  Detection  and 
Estimation  for  Diffusion 
Process  Signals 

4.1  Abstract 

We  consider  the  problem  of  simultaneous  detection  and  estimation  when 
the  signals  corresponding  to  the  M  different  hypotheses  can  be  modelled  as 
outputs  of  M  distinct  stochastic  dynamical  systems  of  the  Ito  type.  Under 
very  mild  assumptions  on  the  models  and  on  the  cost  structure,  we  show 
that  there  exists  a  set  of  sufficient  statistics  for  the  simultaneous  detection-
estimation  problem  that  can  be  computed  recursively  by  linear  equations.
Furthermore,  we  show  that  the  structure  of  the  detector  and  estimator  is 
completely  determined  by  the  cost  structure.  The  methodology  used  em¬ 
ploys  recent  advances  in  nonlinear  filtering  and  stochastic  control  of  partially 
observed  stochastic  systems  of  the  Ito  type.  Specific  examples  and  applications
in  radar  tracking  and  discrimination  problems  are  discussed.


4.2  Introduction 

In  a  typical  present  day  radar  environment,  the  radar  receiver  is  subjected 
to  radiation  from  various  sources.  A  very  important  function  of  the  radar 
receiver  is  its  ability  to  discriminate  between  the  various  waveforms  received 
and  select  the  desired  one  for  further  processing.  Furthermore,  an  equally 
important  function  of  the  receiver  is  to  estimate  important  parameters  of  the 
radiating source from the received waveforms. Thus the receiver is required
often  to  perform  a  “combined  detection  and  estimation”  function. 

An abstract formulation of the combined detection and estimation problem
in the language of statistical decision theory has been developed by
Middleton and Esposito in [1]. They correctly point out that optimal processing
in such problems often requires the mutual coupling of the detection
and  estimation  algorithms.  Although  from  the  mathematical  point  of  view 
estimation  may  be  considered  as  a  generalized  detection  problem,  from  an 
operational  point  of  view,  the  two  procedures  are  different:  e.g.,  one  usually 
selects  different  cost  functions  for  each  and  obtains  different  data  processors 
as  a  result.  It  is  then  correctly  argued  in  [1]  that  it  is  practically  appropriate 
to  retain  the  usual  distinction  between  detection  and  estimation.  There  are 
various  ways  that  the  detector  and  estimator  can  be  coupled  leading  to  a 
hierarchy  of  complex  processors.  We  describe  here  some  important  cases. 

4.3  Detection-Oriented  Estimation 


Here,  the  detection  operation  is  optimized  with  a  priori  knowledge  of  the 
existence  of  an  estimator  following  it.  The  estimator  is  dependent  on  the 
detector’s  decision  by  being  gated  on  only  if  the  detector  decides  that  the 
desired signal is present. Here, the coupling is via cost terms that assess
the performance deterioration when the estimator is turned off while the
signal is present ($C_{0,1}$), or the estimator is turned on while the signal is not
present ($C_{1,0}$). Therefore, the average risks corresponding to the operations
of detection and estimation can be minimized separately. This leads to
a detection test that is a modified generalized likelihood test. If the cost
terms $C_{0,1}$, $C_{1,0}$ are constant, the coupling just reduces to a modification
of the threshold [1]. Since the detector’s decision rule does not depend on
the  estimate,  the  structure  of  the  optimal  estimator  is  not  a  function  of  the 
data  region  specified  by  the  decision  rule  of  the  detector’s  operation,  when 
the  detector’s  decision  is  to  accept  the  signal.  In  practical  terms,  this  means 
that  we  can  choose  to  estimate  only  when  the  detector  has  decided  that  the 
desired  signal  is  present. 
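
A minimal sketch of this gating structure follows, under assumptions that are not in the report: the data are independent Gaussian samples with known standard deviation `sigma`, the signal-absent hypothesis has zero mean, and the signal-present hypothesis has an unknown mean. The detector is a generalized likelihood ratio test, and the constant coupling costs are represented only through the hypothetical argument `log_threshold`.

```python
def detect_then_estimate(samples, sigma, log_threshold):
    """Generalized likelihood ratio detector followed by a gated estimator.

    Returns (detected, estimate); estimate is None when the detector
    decides that the signal is absent.
    """
    n = len(samples)
    mean = sum(samples) / n
    # Log of the likelihood ratio maximized over the unknown mean:
    # n * mean^2 / (2 * sigma^2).
    statistic = n * mean ** 2 / (2.0 * sigma ** 2)
    detected = statistic > log_threshold
    # The estimator is gated on only when the signal is declared present.
    estimate = mean if detected else None
    return detected, estimate

strong = detect_then_estimate([1.0, 1.0, 1.0, 1.0], sigma=1.0, log_threshold=1.0)
weak = detect_then_estimate([0.1, -0.1], sigma=1.0, log_threshold=1.0)
```

Changing `log_threshold` shifts the decision boundary but leaves the estimator itself unchanged, which is the sense in which constant cost terms only modify the threshold.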


4.4  Coupled  Detection-Estimation  with  Decision 
Rejection 

Here,  detection  and  estimation  run  in  parallel  and  are  followed  by  rejection 
of  the  estimate  if  the  detector’s  decision  is  not  to  accept  the  signal.  Here,  the 
detector’s  cost  depends  on  the  value  of  the  estimate.  Typically,  one  solves 
the  detection  problem  knowing  the  estimator.  Then  a  second  optimization  is 
performed  over  all  estimators.  This  case  usually  results  in  relatively  simple 
estimators and complex, highly nonlinear detectors [1].

Motivation for these problems stems from distributed target problems;
see in particular [2]-[7].

We  concentrate  in  this  section  on  a  two  hypotheses  detection  formulation, 
but  it  is  clear  that  the  methods  can  be  easily  extended  to  M-ary  detection 
problems. The two hypotheses are $H_0$: the received signal is a process
$y_{0t}$ plus noise; $H_1$: the received signal is a process $y_{1t}$ (different from $y_{0t}$)
plus noise. Both processes are modeled as outputs of stochastic dynamical
systems  of  the  diffusion  type.  The  noise  is  the  same  in  both  cases.  Due  to 
this  fact,  we  can  assume  that  noise  is  eliminated  from  the  mathematical 
formulation  of  the  problem  of  detection,  while  as  we  shall  see  its  presence 
may  be  crucial  for  the  estimation  problem. 

We did not study detectors with “learning”, and we suggest this is a
promising  extension  of  the  results  reported  here.  We  note,  however,  that 
our  formalism  includes  general  “learning”  algorithms.  Most  of  the  work  on 
detectors  with  “learning”  is  problem  specific  and  does  not  utilize  dynamical 
system  models  for  the  signals  as  we  do.  The  major  criticism  for  the  work  of 
Middleton  and  Esposito  [1]  is  that  although  they  used  a  Bayesian  approach 
to the estimation problem, they considered nonrecursive solutions and
detection was coupled to estimation through cost structure which explicitly
considers coupling of the detection and estimation costs. Clearly nonrecursive
solutions are not appropriate for advanced sensors employed in guided
platforms.  Furthermore,  it  would  be  unrealistic  to  assume  that  the  designer 
has  such  explicit  knowledge  of  the  functional  couplings  between  detection 
and  estimation  costs. 

Several  other  authors  have  analyzed  the  problem.  Scharf  and  Lytle  [13] 
studied  detection  problems  involving  Gaussian  noise  of  unknown  level,  thus 
including  noise  parameters  in  the  problem.  As  in  [1],  their  solution  is  also 
nonrecursive  and  focuses  on  the  existence  of  uniformly  most  powerful  tests. 
Spooner  [14],  [15]  considered  in  detail  unknown  parameters  in  the  noise 
model. Jaffer and Gupta [16], [17] consider the recursive Bayesian problem
using a quadratic cost, Gauss-Markov processes and estimating only signal
parameters. Birdsall and Gobien [18] considered the problem of simultaneous
detection and estimation from a Bayesian viewpoint. This work is close
in  spirit  with  our  approach,  although  the  class  of  problems  we  can  analyze  by 
our  methods  is  significantly  wider.  We  also  follow  a  Bayesian  methodology 
during  the  initial  phase  of  analysis.  It  becomes  clear  that  by  using  Bayesian 
methods  one  can  analyze  the  problems  under  consideration  in  an  inherently 
intuitive,  simple  conceptual  manner  which  can  be  easily  obscured  in  highly 
structured methodologies utilizing specific detector structures and cost
relationships. As a result, one can analyze the special problems described earlier
as  specializations  of  a  wider  picture  and  framework.  The  results  reported  in 
[16]  are  limited  by  two  important  assumptions:  (a)  the  observed  data  have 
densities that display finite dimensional sufficient statistics under both
hypotheses for the unknown parameters, and (b) the unknown parameters form
a  finite-dimensional  vector.  Both  nonsequential  and  sequential  problems  are 
analyzed  in  [18].  The  most  important  result  of  [18]  is  the  proof  that  through 
a  Bayesian  approach  both  estimation  and  detection  occur  simultaneously, 
with the detector using the a posteriori densities generated by two separate
estimators, one for each hypothesis. A particularly attractive feature is
that  no  assumptions  are  made  on  the  estimation  criterion  and  very  flexible 
assumptions  are  made  on  the  detection  criterion.  When  finite-dimensional 
sufficient  statistics  exist,  the  optimum  processor  partitions  naturally  into 
three  parts:  a  “primary”  processor  which  is  totally  independent  of  a  priori 
distributions  on  the  parameters,  a  “secondary”  processor  which  modifies  the 
output according to the priors and solves the detection problem, and an
estimator which uses the output of the other two in estimating the unknown
parameters.  Only  the  estimator  structure  depends  on  cost  functionals. 

Since  dynamical  system  models  are  not  utilized  to  represent  signals  in 
[18], there is great difficulty in analyzing the far more interesting sequential
problem. It is for this reason that one is forced to make the limiting
assumptions  mentioned  above.  In  our  approach,  we  consider  diffusion  type 
models for the signals, and we utilize modern methods from nonlinear filtering
and stochastic control to analyze the problem [19]-[23]. Corresponding
results  for  Markov  chain  models  can  be  easily  obtained,  but  we  only  give 
brief  comments  for  such  problems  here. 


4.5  Nomenclature and Formulation of the Sequential Problem

In  this  section,  we  present  a  general  formulation  for  the  continuous  time, 
sequential, simultaneous detection and estimation problem when the signals
can  be  represented  as  outputs  of  diffusion  type  processes  [20].  To  simplify 
notation,  terminology  and  subsequent  computations,  we  consider  only  the 
scalar  observation  case  here.  All  results  extend  to  vector  observations  in 
a  straight-forward  manner.  The  observed  data  y(t)  constitute,  therefore,  a 
real-valued  scalar  stochastic  process. 

The statistics of $y(\cdot)$ are not completely known. More specifically, they
depend on some parameters and some hypotheses. For simplicity, we shall
consider here only the binary hypotheses detection problem. Extensions to
M-ary detection are trivial. We shall denote by $H_0$, $H_1$ the two mutually
exclusive and exhaustive hypotheses.

Under hypothesis $H_0$, the received data y(t) can be represented as:

$$\begin{aligned}
dy(t) &= h^0(x^0(t),\theta^0)\,dt + dv(t)\\
dx^0(t) &= f^0(x^0(t),\theta^0)\,dt + g^0(x^0(t),\theta^0)\,dw^0(t)
\end{aligned} \qquad (4.1)$$

where $\theta^0$ is a vector-valued unknown parameter that may be assumed fixed
or random throughout the problem. Here $v(\cdot)$, $w^0(\cdot)$ are independent,
1-dimensional and $n_0$-dimensional, respectively, standard Wiener processes
[20]. In other words, when hypothesis $H_0$ is true, the received data can be
thought of as the output of a stochastic dynamical system, corrupted by
white Gaussian noise. $h^0, f^0, g^0, \theta^0$ parameterize the nonlinear stochastic
system.

Similarly, when hypothesis $H_1$ is true, the received data y(t) can be
modelled as

$$\begin{aligned}
dy(t) &= h^1(x^1(t),\theta^1)\,dt + dv(t)\\
dx^1(t) &= f^1(x^1(t),\theta^1)\,dt + g^1(x^1(t),\theta^1)\,dw^1(t)
\end{aligned} \qquad (4.2)$$

where now $x^1$ is $n_1$-dimensional. The vector parameters $\theta^0, \theta^1$ may have
common components. For instance, in the classical “noise or signal-plus-
noise” problem, any noise parameters clearly appear in both hypotheses
and would thus be common to $\theta^0, \theta^1$.

We note that we have the same “observation noise” $v(\cdot)$ under both
hypotheses. This is clearly the case in radar applications (see [6]). On
the other hand, when one is faced with state and parameter dependent
observation noises, a simple transformation translates the two models into the
form (4.1), (4.2). We shall assume that $h^i, f^i, g^i$, $i = 0, 1$, have sufficient
properties to guarantee existence and uniqueness of probability distribution
functions for $y(\cdot)$ under either hypothesis. As a minimal hypothesis, we
assume that the martingale problems for (4.1) and (4.2) are well posed
[24] for all values of $\theta^0, \theta^1$ in appropriate compact sets $\Theta^0$, $\Theta^1$, respectively.
Furthermore, neither (4.1) nor (4.2) exhibits explosions [24] for any value of
the parameters. Often we shall make stronger assumptions such as existence
of strong solutions to (4.1), (4.2), or smoothness of $f^i, g^i, h^i$, $i = 0, 1$, or
existence of classical probability densities for $y(\cdot)$ under either hypothesis.

We shall denote by $p_y^i(\cdot, t \mid \theta^i)$, $i = 0, 1$, the probability density of y(t)
under hypothesis $H_i$ when the parameter takes the value $\theta^i$, $i = 0, 1$.
We shall denote the probability measures corresponding to y under $H_0$ or
$H_1$ by $\mu_y^0$, $\mu_y^1$, respectively. As is well known, these are measures on the space of
continuous functions [24]. Finally, we note that although we have assumed
time invariant stochastic models in (4.1), (4.2), the results extend easily to
the time varying case.

Following a Bayesian approach, we assume a priori densities for the two
parameters $\theta^0, \theta^1$, which will be denoted by $p_\theta^i(\cdot, 0)$, $i = 0, 1$, respectively.
Similarly, initial densities for $x^0(0)$ and $x^1(0)$ are assumed known and
independent of $\theta^0, \theta^1$, respectively. They will be denoted by $p_x^i(\cdot, 0)$. The choice
of these a priori densities is frequently a very interesting problem in
applications, as they represent the designer’s a priori knowledge about the models
used.

With these preliminaries, we can now formulate the problem. Let $y^t$
denote as usual the portion of the observed sample path “up to time t”,
i.e., $y^t = \{y(s),\ s \le t\}$. Given the observed data $y^t$, we wish to design a
processor which at time t will optimally select simultaneously which of the
two hypotheses $H_0$ or $H_1$ is true, and optimal estimates for the parameters
$\theta^0$ and $\theta^1$. Moreover, the processor should operate recursively so as to permit
real-time implementation.

To complete the problem formulation, we need to specify costs for detection
and estimation. Let $c_i(\hat\theta^i(t), \theta^i)$, $i = 0, 1$, be the penalty for “estimating”
$\theta^i$ by $\hat\theta^i(t)$ at time t. If $c_i$ is quadratic, we have the well known minimum
variance estimates. Similarly, let $\gamma(t)$ denote the decision, at time t, of
whether we declare hypothesis $H_0$ or $H_1$ to hold. Then $k(\gamma(t), i)$, $i = 0, 1$, will
denote the penalty when the true hypothesis is $H_i$ and we decide $\gamma(t)$ at
time t. Obviously, there are infinitely many variations on the possible choice
for a cost function. We shall consider only two possibilities in this report.


Finite time average integral cost

$$J_f = E\Big\{\int_0^T \Big(\lambda_e\big[c_0(\hat\theta^0(t),\theta^0)\,\chi\{t,\ \gamma(t)=0\} + c_1(\hat\theta^1(t),\theta^1)\,\chi\{t,\ \gamma(t)=1\}\big] + \lambda_d\,k(\gamma(t), i)\Big)\,dt\Big\} \qquad (4.3)$$

and infinite time average discounted cost

$$J_d = E\Big\{\int_0^\infty C(\gamma,\hat\theta^0,\hat\theta^1,x)\,e^{-\alpha t}\,dt\Big\} \qquad (4.4)$$

where $C(\gamma,\hat\theta^0,\hat\theta^1,x)$ is the integrand in (4.3) and $\alpha$ the discount rate. $\lambda_e$, $\lambda_d$
are weights. The reasons for the characteristic functions $\chi$ appearing in (4.3),
(4.4) are rather obvious. The estimator will contribute cost only when utilized,
and it will be utilized for $\theta^0$ only when $\gamma(t) = 0$. We would like to point
out that this does not preclude both estimators from running continuously.
This scheme is used only to assess costs properly.
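
The role of the characteristic functions can be made concrete with a small sketch. It evaluates a time-discretized version of the finite time cost along one sample path, assuming quadratic estimation penalties and a 0-1 decision penalty. Both penalty choices, the scalar parameters, and all function and argument names are illustrative assumptions rather than the report's definitions.

```python
def cost_integrand(gamma, est0, est1, theta0, theta1, true_h,
                   lam_e=1.0, lam_d=1.0):
    """Running cost at one time instant (assumed quadratic / 0-1 penalties)."""
    c0 = (est0 - theta0) ** 2
    c1 = (est1 - theta1) ** 2
    # Characteristic functions: only the estimator selected by gamma is charged.
    estimation = c0 if gamma == 0 else c1
    # Decision penalty: 1 when the declared hypothesis is wrong, else 0.
    decision = 0.0 if gamma == true_h else 1.0
    return lam_e * estimation + lam_d * decision

def finite_time_cost(path, theta0, theta1, true_h, dt, lam_e=1.0, lam_d=1.0):
    """Riemann-sum approximation of the time integral for one sample path.

    path is a list of (gamma, est0, est1) tuples, one per time step.
    """
    return dt * sum(
        cost_integrand(g, e0, e1, theta0, theta1, true_h, lam_e, lam_d)
        for g, e0, e1 in path
    )

instant = cost_integrand(0, 1.5, 0.0, 1.0, 2.0, true_h=1, lam_e=2.0, lam_d=3.0)
total = finite_time_cost([(0, 1.5, 0.0), (1, 1.5, 2.0)],
                         theta0=1.0, theta1=2.0, true_h=1, dt=0.5,
                         lam_e=2.0, lam_d=3.0)
```

Averaging `finite_time_cost` over many simulated paths would approximate the expectation in the cost; the sketch evaluates a single path only.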

The  appropriate  formulation  of  the  problem  is  as  a  partially  observable 
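stochastic control problem. The admissible controls are

$$\gamma(t),\ \hat\theta^0(t),\ \hat\theta^1(t) \qquad (4.5)$$

where all functions are nonanticipative with respect to y; i.e., measurable
with respect to $F_t^y$:

$$\gamma(t),\ \hat\theta^0(t),\ \hat\theta^1(t) \in F_t^y \qquad (4.6)$$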

The  cost  is  either  (4.3)  or  (4.4).  For  the  system  dynamics,  we  proceed  as 
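follows. The state equations are mixed, consisting of the continuous components

$$\begin{aligned}
dx^0(t) &= f^0(x^0(t),\theta^0(t))\,dt + g^0(x^0(t),\theta^0(t))\,dw^0(t)\\
dx^1(t) &= f^1(x^1(t),\theta^1(t))\,dt + g^1(x^1(t),\theta^1(t))\,dw^1(t)\\
d\theta^0(t) &= 0\\
d\theta^1(t) &= 0
\end{aligned} \qquad (4.7)$$

and the discrete component z(t), which can take only the values 0 or 1 and is
constant. The initial densities for $x^0, x^1, \theta^0, \theta^1$ have already been described.
The initial probability vector for z(t) (which tracks which hypothesis is true)
is

$$\Pr\{z(0) = 0\} = P_0, \quad \Pr\{z(0) = 1\} = P_1 \qquad (4.8)$$

The observations are

$$dy(t) = (1 - z(t))\,h^0(x^0(t),\theta^0)\,dt + z(t)\,h^1(x^1(t),\theta^1)\,dt + dv(t) \qquad (4.9)$$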

Since (4.7) are degenerate, there are some minor technical difficulties,
which can be circumvented, however, using recent techniques. This completes
the formulation of the problem.
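
To illustrate the formulation, the sketch below generates one observation record by an Euler-Maruyama discretization: the hypothesis indicator z is drawn once from the prior and held fixed, and the observation increments are driven by the state of the active hypothesis only. Scalar states, the two example models, and the step size are illustrative assumptions; the unknown parameters are treated as fixed and absorbed into the model functions.

```python
import math
import random

def simulate_observations(model0, model1, P0, T=1.0, dt=1e-3, seed=0):
    """Return (z, dy): the hypothesis indicator and the observation increments.

    Each model is a tuple (f, g, h, x_init) of drift, diffusion coefficient,
    observation function and initial state (all scalar here).
    """
    rng = random.Random(seed)
    z = 0 if rng.random() < P0 else 1   # prior on z; z stays constant in time
    f, g, h, x = (model0, model1)[z]
    n_steps = int(round(T / dt))
    sqrt_dt = math.sqrt(dt)
    dy = []
    for _ in range(n_steps):
        # Observation increment: signal of the active hypothesis plus dv.
        dy.append(h(x) * dt + sqrt_dt * rng.gauss(0.0, 1.0))
        # State increment of the active hypothesis (Euler-Maruyama step).
        x += f(x) * dt + g(x) * sqrt_dt * rng.gauss(0.0, 1.0)
    return z, dy

# Hypothetical example models, not taken from the report.
model0 = (lambda x: -x, lambda x: 1.0, lambda x: x, 0.0)
model1 = (lambda x: -2.0 * x, lambda x: 0.5, lambda x: 1.0 + x, 0.0)
z, dy = simulate_observations(model0, model1, P0=0.5)
```

Only the state of the true hypothesis is simulated, since the other state does not enter the observations.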


4.6  Structure  of  the  Optimal  Processor 

Following  recent  results  [25]-[29]  in  stochastic  optimal  control  theory,  we 
have  obtained  first  the  following  results  that  reduce  the  partially  observed 
stochastic  control  problem  described  in  Section  4.5  to  an  equivalent,  infinite 
dimensional  fully  observed  problem. 

Theorem 1: There exist optimal controls $\gamma$, $\hat\theta^0$, $\hat\theta^1$ for the stochastic
optimal control problem (4.3)-(4.9).

Proof: This follows from the results of Fleming and Pardoux [27] and
Bismut [29]. The only difference is that due to the structure of the dynamics
here (i.e., they do not depend on the controls $\gamma, \hat\theta^0, \hat\theta^1$) we can show that
optimal controls exist in the class of strict sense controls as specified in
Section 4.5 (i.e., $\gamma, \hat\theta^0, \hat\theta^1$ are measurable with respect to $F_t^y$).

We then introduce as in Fleming and Pardoux [27] the associated “separated”
stochastic control problem. In the separated stochastic control
problem, the state at time t is a measure $\Lambda_t$ on $R^N$ (where $N = n_0 + n_1 + 2$),
which is an unnormalized conditional distribution of the state
$x(t) = [x^0(t), x^1(t), \theta^0(t), \theta^1(t), z(t)]^T$ of the problem formulated in Section
4.5. The dynamics of the measure-valued process $\Lambda_t$ obey the Zakai equation
of nonlinear filtering [26]-[31], and [20].

In the sequel, we assume that all functions appearing in (4.1)-(4.9) are
bounded and continuous and that $g^0$, $g^1$ are Lipschitz in $(x^0, \theta^0)$, $(x^1, \theta^1)$,
respectively. Due to the discrete component z(t) of the state x(t), we have
to consider a two-dimensional measure valued process $\Lambda^0$, $\Lambda^1$, where $\Lambda^i$ is
the unnormalized conditional distribution of the state

$$x(t) = [x^0(t), x^1(t), \theta^0(t), \theta^1(t)]$$

(slight abuse of notation here) when hypothesis $H_i$ is true, i = 0, 1. We further
assume that for i = 0, 1, the corresponding Zakai equation has a unique
solution which is absolutely continuous with respect to Lebesgue measure;
i.e., we assume the existence of conditional unnormalized probability densities
for $x(t) \in R^N$ given $y^t$. For results on this, see [30], [31].

Let $u^i(x, t)$ denote the unnormalized conditional probability density of x(t)
given $y^t$ when hypothesis $H_i$ holds. Then $u^i(\cdot, \cdot)$ satisfies the Zakai equation

$$du^i = L_i^* u^i\,dt + dy(t)\,h^i u^i, \quad i = 0, 1 \qquad (4.10)$$

where $L_i^*$ is the formal adjoint to the infinitesimal generator of the ith
component of (4.7); i.e., it has the form


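$$L_i^* u = \frac{1}{2}\sum_{j,k=1}^{N}\frac{\partial^2}{\partial x_j\,\partial x_k}\big(a_{jk}^i(x)\,u\big) - \sum_{j=1}^{N}\frac{\partial}{\partial x_j}\big(\bar f_j^i(x)\,u\big) \qquad (4.11)$$

$$a^i = \sigma^i(\sigma^i)^T, \quad \sigma^i = \begin{pmatrix} g^i & 0\\ 0 & 0 \end{pmatrix}, \quad \bar f^i = \begin{pmatrix} f^i\\ 0 \end{pmatrix} \qquad (4.12)$$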
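
The recursive character of the Zakai equation can be sketched numerically. The example below replaces the diffusion by a finite-state Markov chain with generator matrix `Q` (rows sum to zero), an approximation introduced here for illustration only, and advances the unnormalized weights by a prediction step followed by a multiplicative likelihood correction. The names `Q`, `h`, `zakai_step` and the numbers used are hypothetical.

```python
import math

def zakai_step(u, Q, h, dy, dt):
    """One splitting step for the unnormalized weights of a finite-state model.

    u  : list of unnormalized weights, one per state
    Q  : generator matrix of the approximating chain (list of rows)
    h  : observation function evaluated at each state
    dy : observation increment over the step, dt : step length
    """
    n = len(u)
    # Prediction: explicit Euler step of the forward equation du = Q^T u dt.
    pred = [u[j] + dt * sum(Q[i][j] * u[i] for i in range(n)) for j in range(n)]
    # Correction: multiply by the factor exp(h dy - h^2 dt / 2).
    return [pred[j] * math.exp(h[j] * dy - 0.5 * h[j] ** 2 * dt)
            for j in range(n)]

# With no observation signal (h = 0) the step only redistributes mass.
moved = zakai_step([0.5, 0.5], [[-1.0, 1.0], [2.0, -2.0]], [0.0, 0.0],
                   dy=0.0, dt=0.1)
# With no dynamics (Q = 0) the step only reweights by the likelihood factor.
weighted = zakai_step([1.0, 1.0], [[0.0, 0.0], [0.0, 0.0]], [0.0, 1.0],
                      dy=0.2, dt=0.1)
```

The update is linear in `u`, which is the property that makes the sufficient statistics computable recursively by linear equations; one such recursion would be run for each hypothesis.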


To complete the description of the “separated” stochastic control problem,
let $C(\gamma, \hat\theta^0, \hat\theta^1, x)$ denote the integrand in the cost definition (4.3). Then if
we let

$$u(x,t) = \begin{pmatrix} u^0(x^0,\theta^0,t)\\ u^1(x^1,\theta^1,t) \end{pmatrix} \qquad (4.13)$$

we can rewrite the cost (4.3) as

$$J_f = E_y\Big\{\int_0^T\!\!\int C(\gamma,\hat\theta^0,\hat\theta^1,x)\,[u(x,t)]^T\,dx\,dt\Big\} \qquad (4.14)$$

where $\pi$ is the policy corresponding to a particular selection of $\gamma(\cdot)$, $\hat\theta^0(\cdot)$,
$\hat\theta^1(\cdot)$, and $E_y$ is expectation with respect to y. Note that u depends explicitly
on y.

The separated problem is to choose a policy $\pi$ which is a function of
$u^0$, $u^1$ to minimize (4.14). This is a fully observed problem since $u^0$, $u^1$
satisfy (4.10) and enter directly into (4.14). We then have the following
very important result:

Theorem 2: Under the above assumptions, the optimal $\gamma, \hat\theta^0, \hat\theta^1$ (which
exist according to Theorem 1) are functions of $u^0$, $u^1$ only. That is, they
depend on $y^t$ only through the unnormalized conditional densities $u^0$, $u^1$.


space  of  solutions  of  (4.10). 

Proof:  The  result  is  rather  technical.  A  complete  proof  will  be  given 
elsewhere.  It  follows  by  appropriate  modifications  to  the  results  of  [26],  [32]. 
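
As an illustration of a policy that depends on the data only through the two unnormalized densities, the sketch below computes hypothesis probabilities from the total masses and conditional-mean estimates from the normalized weights. It is a simple myopic rule, not the optimal policy of the theorem, and it assumes that both densities are discretized on grids of equal cell size, were initialized from normalized priors, and are weighted by the prior probabilities `P0`, `P1`. All names are hypothetical.

```python
def hypothesis_posterior(u0, u1, P0, P1):
    """Posterior probabilities of the two hypotheses from the total masses."""
    m0 = P0 * sum(u0)
    m1 = P1 * sum(u1)
    return m0 / (m0 + m1), m1 / (m0 + m1)

def conditional_mean(u, grid):
    """Mean of the grid values under the normalized weights u."""
    return sum(w * x for w, x in zip(u, grid)) / sum(u)

def myopic_policy(u0, u1, grid0, grid1, P0, P1):
    """Decision and estimates computed from u0, u1 only."""
    p0, p1 = hypothesis_posterior(u0, u1, P0, P1)
    gamma = 0 if p0 >= p1 else 1        # declare the more probable hypothesis
    return gamma, conditional_mean(u0, grid0), conditional_mean(u1, grid1)

decision = myopic_policy([1.0, 1.0], [3.0, 1.0],
                         [0.0, 1.0], [0.0, 1.0], P0=0.5, P1=0.5)
```

Both estimators run regardless of the decision; the decision only selects which estimate is reported and charged.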

This  result  opens  the  way  for  promising  electronic  implementation  of  the 
optimal  processor  by  the  following  steps:  (1)  solve  numerically  the  resulting 
variational  inequality  using  the  methods  of  [33],  (2)  implement  the  resulting 
numerical  algorithm  by  a  special  purpose,  multiprocessor,  VLSI  device  along 
the  lines  of  [34].  In  simple  cost  cases,  explicit  solutions  of  the  variational 
inequality  can  be  obtained,  of  course. 

4.7  Motivation and Examples from Radar Tracking Loops

The  primary  motivation  for  the  mathematical  problem  studied  in  Section 
4.6  comes  from  design  consideration  of  advanced  (smart)  sensors  in  guided 
platforms.  To  be  more  specific,  let  us  consider  radar  sensors. 

The radar return from a scatterer carries (depending on the radar
sophistication) significant information about the scatterer. For example, range,
Doppler extent, shape and extent, and motion of a scatterer can be extracted
from a radar return by appropriate processing. In today’s dense environment,
a very important function of an advanced processor is classification of
scatterers. This function is required, for example, by sensors participating
in a surveillance network (since threats must be classified, so that an appropriate
response can be applied), in electronic warfare (since decoys and other
counter-measures can be designed to emulate target characteristics) and in
tracking radars (since the sensor often must develop a tracking path for a
designated priority target).

A related, equally important function of a radar receiver is the estimation
of parameters embedded in the return signal, for example, pulse length,
pulse repetition frequency, amplitude scintillation spectrum, conical scan
frequency, antenna pointing, and surface roughness. The two problems of detection
and estimation are indeed closely related, as explained earlier.

In our earlier work [2]-[5], we have developed statistical models for
distributed scatterers which can represent accurately phenomena characteristic
of distributed scatterer radar returns such as amplitude scintillation and
angle noise or glint. In addition, we have developed similar statistical models
for the effects of multipath on radar returns, for sea clutter returns and for
chaff cloud returns. The models developed in [2]-[5] are of the form

$$\begin{aligned}
dx(t) &= A(t,\theta)\,x(t)\,dt + B(t,\theta)\,dw(t)\\
dy(t) &= h(t, x(t), \theta)\,dt + dv(t)
\end{aligned} \qquad (4.15)$$

Furthermore, A, B, h are piecewise constant with respect to time since
the models developed in [2]-[5] are piecewise stationary. For example in [2],
we used models like (4.15) to describe the RCS scintillation for ships. The
same type of models can be used for other distributed targets such as tanks or
armored vehicles. For example, when the return appears spiky, indicating
higher probability of strong return, an appropriate model is provided by
a lognormal process, where $x(\cdot)$ in (4.15) is scalar and h is chosen to be
an exponential function of x. For chaff clouds, a more appropriate model
is provided by a Rayleigh process, where $x(\cdot)$ is two dimensional, with the
two components being identically distributed, independent Gaussian random
processes and

$$h(t, x(t), \theta) = \sqrt{x_1^2(t) + x_2^2(t)}$$
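
The two observation functions mentioned above can be sketched as follows, together with a short Euler-Maruyama simulation of a Rayleigh-type return. The linear state model is assumed here to consist of two independent, identically distributed scalar components dx_j = -a x_j dt + b dw_j, with `a` and `b` hypothetical constants; the report does not specify these values.

```python
import math
import random

def h_lognormal(x):
    """Observation function for a lognormal return: exponential of a scalar state."""
    return math.exp(x)

def h_rayleigh(x):
    """Observation function for a Rayleigh return: envelope of two components."""
    return math.hypot(x[0], x[1])

def simulate_rayleigh_return(a, b, T=1.0, dt=1e-3, seed=0):
    """Observation increments dy for the Rayleigh model (assumed A = -a I, B = b I)."""
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    x = [0.0, 0.0]
    dy = []
    for _ in range(int(round(T / dt))):
        dy.append(h_rayleigh(x) * dt + sqrt_dt * rng.gauss(0.0, 1.0))
        # Independent, identically distributed Gaussian components.
        x = [xj - a * xj * dt + b * sqrt_dt * rng.gauss(0.0, 1.0) for xj in x]
    return dy

increments = simulate_rayleigh_return(a=1.0, b=1.0)
```

Because the state equation is linear, only the observation function carries the nonlinearity in either model.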

Clearly  then,  in  target  discrimination  problems  with  distributed  targets 
of  this  type,  one  encounters  problems  like  those  treated  in  Section  4.6.  It  is 
important  to  note  that  since  the  first  of  (4.15)  is  linear,  the  corresponding 
filtering and stochastic control problems described in Section 4.6 are definitely
more tractable. For further examples of this type, we refer the reader
to [2]-[5].

Further  research  is  needed  to  apply  the  powerful  results  of  Section  4.6 
to  specific  problems  in  order  to  evaluate  current  design  principles  and  more 
importantly,  in  order  to  suggest  new  electronic  implementations  capable  of 
performing  in  a  dense,  hostile  environment.  In  particular,  the  methodology 
developed  in  Section  4.6  can  be  used  to  identify  the  cost  structures  that  lead 
to  the  specific  hierarchies  suggested  in  the  introduction. 